Re: [PATCH 2/2] virtio: replace arch_has_restricted_virtio_memory_access()

2022-04-26 Thread Juergen Gross via Virtualization

On 26.04.22 21:51, Heiko Carstens wrote:

On Tue, Apr 26, 2022 at 07:35:43PM +0200, Borislav Petkov wrote:

On Tue, Apr 26, 2022 at 03:40:21PM +0200, Juergen Gross wrote:

  /* protected virtualization */
  static void pv_init(void)
  {
if (!is_prot_virt_guest())
return;
  
+	platform_set_feature(PLATFORM_VIRTIO_RESTRICTED_MEM_ACCESS);


Kinda long-ish for my taste. I'll probably call it:

platform_set()

as it is implicit that it sets a feature bit.


...and platform_clear(), instead of platform_reset_feature() please.


Fine with me.




In any case, yeah, looks ok at a quick glance. It would obviously need
for more people to look at it and say whether it makes sense to them and
whether that's fine to have in generic code but so far, the experience
with cc_platform_* says that it seems to work ok in generic code.


We _could_ convert s390's machine flags to this mechanism. Those flags
are historically per-cpu, but if I'm not mistaken none of them is
performance critical anymore, and those who are could/should probably
transformed to jump labels or alternatives anyway.


I was planning to look at the x86 cpu features to see whether some of
those might be candidates to be switched to platform features instead.


Juergen


OpenPGP_0xB0DE9DD628BF132F.asc
Description: OpenPGP public key


OpenPGP_signature
Description: OpenPGP digital signature
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH 2/2] virtio: replace arch_has_restricted_virtio_memory_access()

2022-04-26 Thread Juergen Gross via Virtualization

On 26.04.22 19:35, Borislav Petkov wrote:

On Tue, Apr 26, 2022 at 03:40:21PM +0200, Juergen Gross wrote:

  /* protected virtualization */
  static void pv_init(void)
  {
if (!is_prot_virt_guest())
return;
  
+	platform_set_feature(PLATFORM_VIRTIO_RESTRICTED_MEM_ACCESS);


Kinda long-ish for my taste. I'll probably call it:

platform_set()

as it is implicit that it sets a feature bit.


Okay, fine with me.




diff --git a/arch/x86/mm/mem_encrypt_identity.c 
b/arch/x86/mm/mem_encrypt_identity.c
index b43bc24d2bb6..6043ba6cd17d 100644
--- a/arch/x86/mm/mem_encrypt_identity.c
+++ b/arch/x86/mm/mem_encrypt_identity.c
@@ -40,6 +40,7 @@
  #include 
  #include 
  #include 
+#include 
  
  #include 

  #include 
@@ -566,6 +567,10 @@ void __init sme_enable(struct boot_params *bp)
} else {
/* SEV state cannot be controlled by a command line option */
sme_me_mask = me_mask;
+
+   /* Set restricted memory access for virtio. */
+   platform_set_feature(PLATFORM_VIRTIO_RESTRICTED_MEM_ACCESS);


Huh, what does that have to do with SME?


I picked the function where sev_status is being set, as this seemed to be
the correct place to set the feature bit.

Looking at it in more detail it might be preferable to do it in
sev_setup_arch() instead.


Juergen


OpenPGP_0xB0DE9DD628BF132F.asc
Description: OpenPGP public key


OpenPGP_signature
Description: OpenPGP digital signature
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH V3 6/9] virtio-ccw: implement synchronize_cbs()

2022-04-26 Thread Michael S. Tsirkin
On Wed, Apr 27, 2022 at 11:53:25AM +0800, Jason Wang wrote:
> On Tue, Apr 26, 2022 at 2:30 PM Michael S. Tsirkin  wrote:
> >
> > On Tue, Apr 26, 2022 at 12:07:39PM +0800, Jason Wang wrote:
> > > On Tue, Apr 26, 2022 at 11:55 AM Michael S. Tsirkin  
> > > wrote:
> > > >
> > > > On Mon, Apr 25, 2022 at 11:53:24PM -0400, Michael S. Tsirkin wrote:
> > > > > On Tue, Apr 26, 2022 at 11:42:45AM +0800, Jason Wang wrote:
> > > > > >
> > > > > > 在 2022/4/26 11:38, Michael S. Tsirkin 写道:
> > > > > > > On Mon, Apr 25, 2022 at 11:35:41PM -0400, Michael S. Tsirkin 
> > > > > > > wrote:
> > > > > > > > On Tue, Apr 26, 2022 at 04:29:11AM +0200, Halil Pasic wrote:
> > > > > > > > > On Mon, 25 Apr 2022 09:59:55 -0400
> > > > > > > > > "Michael S. Tsirkin"  wrote:
> > > > > > > > >
> > > > > > > > > > On Mon, Apr 25, 2022 at 10:54:24AM +0200, Cornelia Huck 
> > > > > > > > > > wrote:
> > > > > > > > > > > On Mon, Apr 25 2022, "Michael S. Tsirkin" 
> > > > > > > > > > >  wrote:
> > > > > > > > > > > > On Mon, Apr 25, 2022 at 10:44:15AM +0800, Jason Wang 
> > > > > > > > > > > > wrote:
> > > > > > > > > > > > > This patch tries to implement the synchronize_cbs() 
> > > > > > > > > > > > > for ccw. For the
> > > > > > > > > > > > > vring_interrupt() that is called via 
> > > > > > > > > > > > > virtio_airq_handler(), the
> > > > > > > > > > > > > synchronization is simply done via the airq_info's 
> > > > > > > > > > > > > lock. For the
> > > > > > > > > > > > > vring_interrupt() that is called via 
> > > > > > > > > > > > > virtio_ccw_int_handler(), a per
> > > > > > > > > > > > > device spinlock for irq is introduced ans used in the 
> > > > > > > > > > > > > synchronization
> > > > > > > > > > > > > method.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Cc: Thomas Gleixner 
> > > > > > > > > > > > > Cc: Peter Zijlstra 
> > > > > > > > > > > > > Cc: "Paul E. McKenney" 
> > > > > > > > > > > > > Cc: Marc Zyngier 
> > > > > > > > > > > > > Cc: Halil Pasic 
> > > > > > > > > > > > > Cc: Cornelia Huck 
> > > > > > > > > > > > > Signed-off-by: Jason Wang 
> > > > > > > > > > > >
> > > > > > > > > > > > This is the only one that is giving me pause. Halil, 
> > > > > > > > > > > > Cornelia,
> > > > > > > > > > > > should we be concerned about the performance impact 
> > > > > > > > > > > > here?
> > > > > > > > > > > > Any chance it can be tested?
> > > > > > > > > > > We can have a bunch of devices using the same airq 
> > > > > > > > > > > structure, and the
> > > > > > > > > > > sync cb creates a choke point, same as 
> > > > > > > > > > > registering/unregistering.
> > > > > > > > > > BTW can callbacks for multiple VQs run on multiple CPUs at 
> > > > > > > > > > the moment?
> > > > > > > > > I'm not sure I understand the question.
> > > > > > > > >
> > > > > > > > > I do think we can have multiple CPUs that are executing some 
> > > > > > > > > portion of
> > > > > > > > > virtio_ccw_int_handler(). So I guess the answer is yes. 
> > > > > > > > > Connie what do you think?
> > > > > > > > >
> > > > > > > > > On the other hand we could also end up serializing 
> > > > > > > > > synchronize_cbs()
> > > > > > > > > calls for different devices if they happen to use the same 
> > > > > > > > > airq_info. But
> > > > > > > > > this probably was not your question
> > > > > > > >
> > > > > > > > I am less concerned about  synchronize_cbs being slow and more 
> > > > > > > > about
> > > > > > > > the slowdown in interrupt processing itself.
> > > > > > > >
> > > > > > > > > > this patch serializes them on a spinlock.
> > > > > > > > > >
> > > > > > > > > Those could then pile up on the newly introduced spinlock.
> > > > > > > > >
> > > > > > > > > Regards,
> > > > > > > > > Halil
> > > > > > > > Hmm yea ... not good.
> > > > > > > Is there any other way to synchronize with all callbacks?
> > > > > >
> > > > > >
> > > > > > Maybe using rwlock as airq handler?
> > > > > >
> > > > > > Thanks
> > > > > >
> > > > >
> > > > > rwlock is still a shared cacheline bouncing between CPUs and
> > > > > a bunch of ordering instructions.
> > >
> > > Yes, but it should be faster than spinlocks anyhow.
> > >
> > > > > Maybe something per-cpu + some IPIs to run things on all CPUs instead?
> > >
> > > Is this something like a customized version of 
> > > synchronzie_rcu_expedited()?
> >
> > With interrupts running in an RCU read size critical section?
> 
> For vring_interrupt(), yes.
> 
> 
> > Quite possibly that is also an option.
> > This will need a bunch of documentation since this is not
> > a standard use of RCU,
> 
> According to Documentation/RCU/requirements.rst, it looks like a legal case:
> 
> "
> The Linux kernel has interrupts, and RCU read-side critical sections are
> legal within interrupt handlers and within interrupt-disabled regions of
> code, as are invocations of call_rcu().
> "

My problem is it is not clear what data is protected by rcu here.
Nothing is tagged with __rcu or uses rcu_dereference.
We need at least an ack from rcu m

Re: [PATCH V3 6/9] virtio-ccw: implement synchronize_cbs()

2022-04-26 Thread Michael S. Tsirkin
On Wed, Apr 27, 2022 at 11:53:25AM +0800, Jason Wang wrote:
> On Tue, Apr 26, 2022 at 2:30 PM Michael S. Tsirkin  wrote:
> >
> > On Tue, Apr 26, 2022 at 12:07:39PM +0800, Jason Wang wrote:
> > > On Tue, Apr 26, 2022 at 11:55 AM Michael S. Tsirkin  
> > > wrote:
> > > >
> > > > On Mon, Apr 25, 2022 at 11:53:24PM -0400, Michael S. Tsirkin wrote:
> > > > > On Tue, Apr 26, 2022 at 11:42:45AM +0800, Jason Wang wrote:
> > > > > >
> > > > > > 在 2022/4/26 11:38, Michael S. Tsirkin 写道:
> > > > > > > On Mon, Apr 25, 2022 at 11:35:41PM -0400, Michael S. Tsirkin 
> > > > > > > wrote:
> > > > > > > > On Tue, Apr 26, 2022 at 04:29:11AM +0200, Halil Pasic wrote:
> > > > > > > > > On Mon, 25 Apr 2022 09:59:55 -0400
> > > > > > > > > "Michael S. Tsirkin"  wrote:
> > > > > > > > >
> > > > > > > > > > On Mon, Apr 25, 2022 at 10:54:24AM +0200, Cornelia Huck 
> > > > > > > > > > wrote:
> > > > > > > > > > > On Mon, Apr 25 2022, "Michael S. Tsirkin" 
> > > > > > > > > > >  wrote:
> > > > > > > > > > > > On Mon, Apr 25, 2022 at 10:44:15AM +0800, Jason Wang 
> > > > > > > > > > > > wrote:
> > > > > > > > > > > > > This patch tries to implement the synchronize_cbs() 
> > > > > > > > > > > > > for ccw. For the
> > > > > > > > > > > > > vring_interrupt() that is called via 
> > > > > > > > > > > > > virtio_airq_handler(), the
> > > > > > > > > > > > > synchronization is simply done via the airq_info's 
> > > > > > > > > > > > > lock. For the
> > > > > > > > > > > > > vring_interrupt() that is called via 
> > > > > > > > > > > > > virtio_ccw_int_handler(), a per
> > > > > > > > > > > > > device spinlock for irq is introduced ans used in the 
> > > > > > > > > > > > > synchronization
> > > > > > > > > > > > > method.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Cc: Thomas Gleixner 
> > > > > > > > > > > > > Cc: Peter Zijlstra 
> > > > > > > > > > > > > Cc: "Paul E. McKenney" 
> > > > > > > > > > > > > Cc: Marc Zyngier 
> > > > > > > > > > > > > Cc: Halil Pasic 
> > > > > > > > > > > > > Cc: Cornelia Huck 
> > > > > > > > > > > > > Signed-off-by: Jason Wang 
> > > > > > > > > > > >
> > > > > > > > > > > > This is the only one that is giving me pause. Halil, 
> > > > > > > > > > > > Cornelia,
> > > > > > > > > > > > should we be concerned about the performance impact 
> > > > > > > > > > > > here?
> > > > > > > > > > > > Any chance it can be tested?
> > > > > > > > > > > We can have a bunch of devices using the same airq 
> > > > > > > > > > > structure, and the
> > > > > > > > > > > sync cb creates a choke point, same as 
> > > > > > > > > > > registering/unregistering.
> > > > > > > > > > BTW can callbacks for multiple VQs run on multiple CPUs at 
> > > > > > > > > > the moment?
> > > > > > > > > I'm not sure I understand the question.
> > > > > > > > >
> > > > > > > > > I do think we can have multiple CPUs that are executing some 
> > > > > > > > > portion of
> > > > > > > > > virtio_ccw_int_handler(). So I guess the answer is yes. 
> > > > > > > > > Connie what do you think?
> > > > > > > > >
> > > > > > > > > On the other hand we could also end up serializing 
> > > > > > > > > synchronize_cbs()
> > > > > > > > > calls for different devices if they happen to use the same 
> > > > > > > > > airq_info. But
> > > > > > > > > this probably was not your question
> > > > > > > >
> > > > > > > > I am less concerned about  synchronize_cbs being slow and more 
> > > > > > > > about
> > > > > > > > the slowdown in interrupt processing itself.
> > > > > > > >
> > > > > > > > > > this patch serializes them on a spinlock.
> > > > > > > > > >
> > > > > > > > > Those could then pile up on the newly introduced spinlock.
> > > > > > > > >
> > > > > > > > > Regards,
> > > > > > > > > Halil
> > > > > > > > Hmm yea ... not good.
> > > > > > > Is there any other way to synchronize with all callbacks?
> > > > > >
> > > > > >
> > > > > > Maybe using rwlock as airq handler?
> > > > > >
> > > > > > Thanks
> > > > > >
> > > > >
> > > > > rwlock is still a shared cacheline bouncing between CPUs and
> > > > > a bunch of ordering instructions.
> > >
> > > Yes, but it should be faster than spinlocks anyhow.
> > >
> > > > > Maybe something per-cpu + some IPIs to run things on all CPUs instead?
> > >
> > > Is this something like a customized version of 
> > > synchronzie_rcu_expedited()?
> >
> > With interrupts running in an RCU read size critical section?
> 
> For vring_interrupt(), yes.
> 
> 
> > Quite possibly that is also an option.
> > This will need a bunch of documentation since this is not
> > a standard use of RCU,
> 
> According to Documentation/RCU/requirements.rst, it looks like a legal case:
> 
> "
> The Linux kernel has interrupts, and RCU read-side critical sections are
> legal within interrupt handlers and within interrupt-disabled regions of
> code, as are invocations of call_rcu().
> "
> 
> And as discussed, synchronize_rcu_expedited() is not friendly to real
> time workload.

I am not sure hotplug removal is relevant for realtime anyway 

Re: [PATCH 1/2] kernel: add platform_has() infrastructure

2022-04-26 Thread Juergen Gross via Virtualization

On 26.04.22 19:31, Borislav Petkov wrote:

On Tue, Apr 26, 2022 at 03:40:20PM +0200, Juergen Gross wrote:

diff --git a/kernel/platform-feature.c b/kernel/platform-feature.c
new file mode 100644
index ..2d52f8442cd5
--- /dev/null
+++ b/kernel/platform-feature.c
@@ -0,0 +1,7 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include 
+#include 
+
+unsigned long __read_mostly platform_features[PLATFORM_FEAT_ARRAY_SZ];


Probably __ro_after_init.


Yes, good idea.




+EXPORT_SYMBOL_GPL(platform_features);


You probably should make that thing static and use only accessors to
modify it in case you wanna change the underlying data structure in the
future.


The question is whether we think that those platform features will be used
in hot paths or not. If so the inline accessors (at least the platform_has()
one) would be preferred IMO.

OTOH really performance critical cases could use static_branch or such.


Juergen


OpenPGP_0xB0DE9DD628BF132F.asc
Description: OpenPGP public key


OpenPGP_signature
Description: OpenPGP digital signature
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH V2] vDPA/ifcvf: allow userspace to suspend a queue

2022-04-26 Thread Jason Wang


在 2022/4/24 19:33, Zhu Lingshan 写道:

Formerly, ifcvf driver has implemented a lazy-initialization mechanism
for the virtqueues, it would store all virtqueue config fields that
passed down from the userspace, then load them to the virtqueues and
enable the queues upon DRIVER_OK.

To allow the userspace to suspend a virtqueue,
this commit passes queue_enable to the virtqueue directly through
set_vq_ready().

This feature requires and this commits implementing all virtqueue
ops(set_vq_addr, set_vq_num and set_vq_ready) to take immediate
actions than lazy-initialization, so ifcvf_hw_enable() is retired.

set_features() should take immediate actions as well.

ifcvf_add_status() is retierd because we should not add
status like FEATURES_OK by ifcvf's decision, this driver should
only set device status upon vdpa_ops.set_status()

To avoid losing virtqueue configurations caused by multiple
rounds of reset(), this commit also refactors thed evice reset
routine, now it simply reset the config handler and the virtqueues,
and only once device-reset().



It looks like the patch tries to do too many things at one run. I'd 
suggest to split them:



1) on-the-fly set via set_vq_ready(), but I don't see a reason why we 
need to change other lazy stuffs, since setting queue_enable to 1 before 
DRIVER_OK won't start the virtqueue anyhow

2) if necessary, converting the lazy stuffs
3) the synchornize_irq() fixes
4) other stuffs

Thanks




Signed-off-by: Zhu Lingshan 
---
  drivers/vdpa/ifcvf/ifcvf_base.c | 150 +++-
  drivers/vdpa/ifcvf/ifcvf_base.h |  16 ++--
  drivers/vdpa/ifcvf/ifcvf_main.c |  81 +++--
  3 files changed, 111 insertions(+), 136 deletions(-)

diff --git a/drivers/vdpa/ifcvf/ifcvf_base.c b/drivers/vdpa/ifcvf/ifcvf_base.c
index 48c4dadb0c7c..bbc9007a6f34 100644
--- a/drivers/vdpa/ifcvf/ifcvf_base.c
+++ b/drivers/vdpa/ifcvf/ifcvf_base.c
@@ -179,20 +179,7 @@ void ifcvf_set_status(struct ifcvf_hw *hw, u8 status)
  
  void ifcvf_reset(struct ifcvf_hw *hw)

  {
-   hw->config_cb.callback = NULL;
-   hw->config_cb.private = NULL;
-
ifcvf_set_status(hw, 0);
-   /* flush set_status, make sure VF is stopped, reset */
-   ifcvf_get_status(hw);
-}
-
-static void ifcvf_add_status(struct ifcvf_hw *hw, u8 status)
-{
-   if (status != 0)
-   status |= ifcvf_get_status(hw);
-
-   ifcvf_set_status(hw, status);
ifcvf_get_status(hw);
  }
  
@@ -213,7 +200,7 @@ u64 ifcvf_get_hw_features(struct ifcvf_hw *hw)

return features;
  }
  
-u64 ifcvf_get_features(struct ifcvf_hw *hw)

+u64 ifcvf_get_device_features(struct ifcvf_hw *hw)
  {
return hw->hw_features;
  }
@@ -280,7 +267,7 @@ void ifcvf_write_dev_config(struct ifcvf_hw *hw, u64 offset,
vp_iowrite8(*p++, hw->dev_cfg + offset + i);
  }
  
-static void ifcvf_set_features(struct ifcvf_hw *hw, u64 features)

+void ifcvf_set_features(struct ifcvf_hw *hw, u64 features)
  {
struct virtio_pci_common_cfg __iomem *cfg = hw->common_cfg;
  
@@ -289,22 +276,22 @@ static void ifcvf_set_features(struct ifcvf_hw *hw, u64 features)
  
  	vp_iowrite32(1, &cfg->guest_feature_select);

vp_iowrite32(features >> 32, &cfg->guest_feature);
+
+   vp_ioread32(&cfg->guest_feature);
  }
  
-static int ifcvf_config_features(struct ifcvf_hw *hw)

+u64 ifcvf_get_features(struct ifcvf_hw *hw)
  {
-   struct ifcvf_adapter *ifcvf;
+   struct virtio_pci_common_cfg __iomem *cfg = hw->common_cfg;
+   u64 features;
  
-	ifcvf = vf_to_adapter(hw);

-   ifcvf_set_features(hw, hw->req_features);
-   ifcvf_add_status(hw, VIRTIO_CONFIG_S_FEATURES_OK);
+   vp_iowrite32(0, &cfg->device_feature_select);
+   features = vp_ioread32(&cfg->device_feature);
  
-	if (!(ifcvf_get_status(hw) & VIRTIO_CONFIG_S_FEATURES_OK)) {

-   IFCVF_ERR(ifcvf->pdev, "Failed to set FEATURES_OK status\n");
-   return -EIO;
-   }
+   vp_iowrite32(1, &cfg->device_feature_select);
+   features |= ((u64)vp_ioread32(&cfg->guest_feature) << 32);
  
-	return 0;

+   return features;
  }
  
  u16 ifcvf_get_vq_state(struct ifcvf_hw *hw, u16 qid)

@@ -331,68 +318,111 @@ int ifcvf_set_vq_state(struct ifcvf_hw *hw, u16 qid, u16 
num)
ifcvf_lm = (struct ifcvf_lm_cfg __iomem *)hw->lm_cfg;
q_pair_id = qid / hw->nr_vring;
avail_idx_addr = &ifcvf_lm->vring_lm_cfg[q_pair_id].idx_addr[qid % 2];
-   hw->vring[qid].last_avail_idx = num;
vp_iowrite16(num, avail_idx_addr);
  
  	return 0;

  }
  
-static int ifcvf_hw_enable(struct ifcvf_hw *hw)

+void ifcvf_set_vq_num(struct ifcvf_hw *hw, u16 qid, u32 num)
  {
-   struct virtio_pci_common_cfg __iomem *cfg;
-   u32 i;
+   struct virtio_pci_common_cfg __iomem *cfg = hw->common_cfg;
  
-	cfg = hw->common_cfg;

-   for (i = 0; i < hw->nr_vring; i++) {
-   if (!hw->vring[i].ready)
-   break;
+   vp_iowrite16(qid, &cfg->queue_s

Re: [PATCH] vDPA/ifcvf: fix uninitialized config_vector warning

2022-04-26 Thread Jason Wang
On Sun, Apr 24, 2022 at 3:36 PM Zhu Lingshan  wrote:
>
> Static checkers are not informed that config_vector is controlled
> by vf->msix_vector_status, which can only be
> MSIX_VECTOR_SHARED_VQ_AND_CONFIG, MSIX_VECTOR_SHARED_VQ_AND_CONFIG
> and MSIX_VECTOR_DEV_SHARED.
>
> This commit uses an "if...elseif...else" code block to tell the
> checkers that it is a complete set, and config_vector can be
> initialized anyway
>
> Signed-off-by: Zhu Lingshan 
> Reviewed-by: Dan Carpenter 

Acked-by: Jason Wang 

> ---
>  drivers/vdpa/ifcvf/ifcvf_main.c | 12 ++--
>  1 file changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/vdpa/ifcvf/ifcvf_main.c b/drivers/vdpa/ifcvf/ifcvf_main.c
> index 4366320fb68d..9172905fc7ae 100644
> --- a/drivers/vdpa/ifcvf/ifcvf_main.c
> +++ b/drivers/vdpa/ifcvf/ifcvf_main.c
> @@ -290,16 +290,16 @@ static int ifcvf_request_config_irq(struct 
> ifcvf_adapter *adapter)
> struct ifcvf_hw *vf = &adapter->vf;
> int config_vector, ret;
>
> -   if (vf->msix_vector_status == MSIX_VECTOR_DEV_SHARED)
> -   return 0;
> -
> if (vf->msix_vector_status == MSIX_VECTOR_PER_VQ_AND_CONFIG)
> -   /* vector 0 ~ vf->nr_vring for vqs, num vf->nr_vring vector 
> for config interrupt */
> config_vector = vf->nr_vring;
> -
> -   if (vf->msix_vector_status ==  MSIX_VECTOR_SHARED_VQ_AND_CONFIG)
> +   else if (vf->msix_vector_status ==  MSIX_VECTOR_SHARED_VQ_AND_CONFIG)
> /* vector 0 for vqs and 1 for config interrupt */
> config_vector = 1;
> +   else if (vf->msix_vector_status == MSIX_VECTOR_DEV_SHARED)
> +   /* re-use the vqs vector */
> +   return 0;
> +   else
> +   return -EINVAL;
>
> snprintf(vf->config_msix_name, 256, "ifcvf[%s]-config\n",
>  pci_name(pdev));
> --
> 2.31.1
>

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v2 2/2] vdpa/vp_vdpa : add vdpa tool support in vp_vdpa

2022-04-26 Thread Jason Wang


在 2022/4/27 09:58, Cindy Lu 写道:

On Tue, Apr 26, 2022 at 2:05 PM Jason Wang  wrote:

On Mon, Apr 25, 2022 at 2:27 PM Cindy Lu  wrote:

this patch is to add the support for vdpa tool in vp_vdpa
here is the example steps

modprobe vp_vdpa
modprobe vhost_vdpa
echo :00:06.0>/sys/bus/pci/drivers/virtio-pci/unbind
echo 1af4 1041 > /sys/bus/pci/drivers/vp-vdpa/new_id

vdpa dev add name vdpa1 mgmtdev pci/:00:06.0

Signed-off-by: Cindy Lu 
---
  drivers/vdpa/virtio_pci/vp_vdpa.c | 138 +++---
  1 file changed, 106 insertions(+), 32 deletions(-)

diff --git a/drivers/vdpa/virtio_pci/vp_vdpa.c 
b/drivers/vdpa/virtio_pci/vp_vdpa.c
index cce101e6a940..873402977543 100644
--- a/drivers/vdpa/virtio_pci/vp_vdpa.c
+++ b/drivers/vdpa/virtio_pci/vp_vdpa.c
@@ -32,7 +32,8 @@ struct vp_vring {

  struct vp_vdpa {
 struct vdpa_device vdpa;
-   struct virtio_pci_modern_device mdev;
+   /* this is an pointer point to the mdev in vp_vdpa_mgmtdev*/
+   struct virtio_pci_modern_device *mdev;

The code can explain itself, so the comment is redundant.


sure wll remove this.

 struct vp_vring *vring;
 struct vdpa_callback config_cb;
 char msix_name[VP_VDPA_NAME_SIZE];
@@ -41,6 +42,12 @@ struct vp_vdpa {
 int vectors;
  };

+struct vp_vdpa_mgmtdev {
+   struct vdpa_mgmt_dev mgtdev;
+   struct virtio_pci_modern_device mdev;

I think coupling it with mgmt device is probably not good, any reason
we can't allocate it independently?


yes I also think it make code confused,  but we need to init  it in
static int vp_vdpa_probe(struct pci_dev *pdev, const struct pci_device_id *id)
and then  pass it to
static int vp_vdpa_dev_add(struct vdpa_mgmt_dev *v_mdev, const char *name,
const struct vdpa_dev_set_config *add_config)
the only struct we can use is struct vdpa_mgmt_dev *v_mdev,  not sure
if we have
some better choice ?



I think we can just allocate it and assign it to vdpa_mgmt_dev structure?

struct vp_vdpa_mgmtdev {
    stuct virtio_pci_modern_device *mdev;
};

Thanks





Thanks


+   struct vp_vdpa *vp_vdpa;
+};


___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH v2 1/2] vdpa: add the check for id_table in struct vdpa_mgmt_dev

2022-04-26 Thread Jason Wang


在 2022/4/27 10:01, Cindy Lu 写道:

On Mon, Apr 25, 2022 at 5:00 PM Jason Wang  wrote:

On Mon, Apr 25, 2022 at 2:27 PM Cindy Lu  wrote:

To support the dynamic ids in vp_vdpa, we need to add the check for
id table. If the id table is NULL, will not set the device type

Signed-off-by: Cindy Lu 
---
  drivers/vdpa/vdpa.c | 11 +++
  1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
index 1ea525433a5c..09edd92cede0 100644
--- a/drivers/vdpa/vdpa.c
+++ b/drivers/vdpa/vdpa.c
@@ -492,10 +492,13 @@ static int vdpa_mgmtdev_fill(const struct vdpa_mgmt_dev 
*mdev, struct sk_buff *m
 if (err)
 goto msg_err;

-   while (mdev->id_table[i].device) {
-   if (mdev->id_table[i].device <= 63)
-   supported_classes |= BIT_ULL(mdev->id_table[i].device);
-   i++;
+   if (mdev->id_table != NULL) {
+   while (mdev->id_table[i].device) {
+   if (mdev->id_table[i].device <= 63)
+   supported_classes |=
+   BIT_ULL(mdev->id_table[i].device);
+   i++;
+   }
 }

This will cause 0 to be advertised as the supported classes.

I wonder if we can simply use VIRTIO_DEV_ANY_ID here (and need to
export it to via uAPI probably).

Thanks


like the below one? not sure if this ok to use like this?
static struct virtio_device_id vp_vdpa_id_table[] = {
{ VIRTIO_DEV_ANY_ID, VIRTIO_DEV_ANY_ID },
{ 0 },
};



Something like this.

Thanks






 if (nla_put_u64_64bit(msg, VDPA_ATTR_MGMTDEV_SUPPORTED_CLASSES,
--
2.34.1



___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH V3 6/9] virtio-ccw: implement synchronize_cbs()

2022-04-26 Thread Jason Wang
On Tue, Apr 26, 2022 at 2:30 PM Michael S. Tsirkin  wrote:
>
> On Tue, Apr 26, 2022 at 12:07:39PM +0800, Jason Wang wrote:
> > On Tue, Apr 26, 2022 at 11:55 AM Michael S. Tsirkin  wrote:
> > >
> > > On Mon, Apr 25, 2022 at 11:53:24PM -0400, Michael S. Tsirkin wrote:
> > > > On Tue, Apr 26, 2022 at 11:42:45AM +0800, Jason Wang wrote:
> > > > >
> > > > > 在 2022/4/26 11:38, Michael S. Tsirkin 写道:
> > > > > > On Mon, Apr 25, 2022 at 11:35:41PM -0400, Michael S. Tsirkin wrote:
> > > > > > > On Tue, Apr 26, 2022 at 04:29:11AM +0200, Halil Pasic wrote:
> > > > > > > > On Mon, 25 Apr 2022 09:59:55 -0400
> > > > > > > > "Michael S. Tsirkin"  wrote:
> > > > > > > >
> > > > > > > > > On Mon, Apr 25, 2022 at 10:54:24AM +0200, Cornelia Huck wrote:
> > > > > > > > > > On Mon, Apr 25 2022, "Michael S. Tsirkin"  
> > > > > > > > > > wrote:
> > > > > > > > > > > On Mon, Apr 25, 2022 at 10:44:15AM +0800, Jason Wang 
> > > > > > > > > > > wrote:
> > > > > > > > > > > > This patch tries to implement the synchronize_cbs() for 
> > > > > > > > > > > > ccw. For the
> > > > > > > > > > > > vring_interrupt() that is called via 
> > > > > > > > > > > > virtio_airq_handler(), the
> > > > > > > > > > > > synchronization is simply done via the airq_info's 
> > > > > > > > > > > > lock. For the
> > > > > > > > > > > > vring_interrupt() that is called via 
> > > > > > > > > > > > virtio_ccw_int_handler(), a per
> > > > > > > > > > > > device spinlock for irq is introduced ans used in the 
> > > > > > > > > > > > synchronization
> > > > > > > > > > > > method.
> > > > > > > > > > > >
> > > > > > > > > > > > Cc: Thomas Gleixner 
> > > > > > > > > > > > Cc: Peter Zijlstra 
> > > > > > > > > > > > Cc: "Paul E. McKenney" 
> > > > > > > > > > > > Cc: Marc Zyngier 
> > > > > > > > > > > > Cc: Halil Pasic 
> > > > > > > > > > > > Cc: Cornelia Huck 
> > > > > > > > > > > > Signed-off-by: Jason Wang 
> > > > > > > > > > >
> > > > > > > > > > > This is the only one that is giving me pause. Halil, 
> > > > > > > > > > > Cornelia,
> > > > > > > > > > > should we be concerned about the performance impact here?
> > > > > > > > > > > Any chance it can be tested?
> > > > > > > > > > We can have a bunch of devices using the same airq 
> > > > > > > > > > structure, and the
> > > > > > > > > > sync cb creates a choke point, same as 
> > > > > > > > > > registering/unregistering.
> > > > > > > > > BTW can callbacks for multiple VQs run on multiple CPUs at 
> > > > > > > > > the moment?
> > > > > > > > I'm not sure I understand the question.
> > > > > > > >
> > > > > > > > I do think we can have multiple CPUs that are executing some 
> > > > > > > > portion of
> > > > > > > > virtio_ccw_int_handler(). So I guess the answer is yes. Connie 
> > > > > > > > what do you think?
> > > > > > > >
> > > > > > > > On the other hand we could also end up serializing 
> > > > > > > > synchronize_cbs()
> > > > > > > > calls for different devices if they happen to use the same 
> > > > > > > > airq_info. But
> > > > > > > > this probably was not your question
> > > > > > >
> > > > > > > I am less concerned about  synchronize_cbs being slow and more 
> > > > > > > about
> > > > > > > the slowdown in interrupt processing itself.
> > > > > > >
> > > > > > > > > this patch serializes them on a spinlock.
> > > > > > > > >
> > > > > > > > Those could then pile up on the newly introduced spinlock.
> > > > > > > >
> > > > > > > > Regards,
> > > > > > > > Halil
> > > > > > > Hmm yea ... not good.
> > > > > > Is there any other way to synchronize with all callbacks?
> > > > >
> > > > >
> > > > > Maybe using rwlock as airq handler?
> > > > >
> > > > > Thanks
> > > > >
> > > >
> > > > rwlock is still a shared cacheline bouncing between CPUs and
> > > > a bunch of ordering instructions.
> >
> > Yes, but it should be faster than spinlocks anyhow.
> >
> > > > Maybe something per-cpu + some IPIs to run things on all CPUs instead?
> >
> > Is this something like a customized version of synchronzie_rcu_expedited()?
>
> With interrupts running in an RCU read size critical section?

For vring_interrupt(), yes.


> Quite possibly that is also an option.
> This will need a bunch of documentation since this is not
> a standard use of RCU,

According to Documentation/RCU/requirements.rst, it looks like a legal case:

"
The Linux kernel has interrupts, and RCU read-side critical sections are
legal within interrupt handlers and within interrupt-disabled regions of
code, as are invocations of call_rcu().
"

And as discussed, synchronize_rcu_expedited() is not friendly to real
time workload. I think we can simply

1) protect vring_interrupt() with rcu_read_lock()
2) use synchronize_rcu() in synchronize_cbs for ccw

And if we care about the long delay we can use per device srcu to reduce that?

Thanks

> and probably get a confirmation
> from RCU maintainers that whatever assumptions we make
> are guaranteed to hold down the road.
>
> > >
> > > ... and I think classic and device interrupts are differe

Re: [PATCH V3 6/9] virtio-ccw: implement synchronize_cbs()

2022-04-26 Thread Michael S. Tsirkin
On Tue, Apr 26, 2022 at 05:47:17PM +0200, Cornelia Huck wrote:
> On Mon, Apr 25 2022, "Michael S. Tsirkin"  wrote:
> 
> > On Mon, Apr 25, 2022 at 11:53:24PM -0400, Michael S. Tsirkin wrote:
> >> On Tue, Apr 26, 2022 at 11:42:45AM +0800, Jason Wang wrote:
> >> > 
> >> > 在 2022/4/26 11:38, Michael S. Tsirkin 写道:
> >> > > On Mon, Apr 25, 2022 at 11:35:41PM -0400, Michael S. Tsirkin wrote:
> >> > > > On Tue, Apr 26, 2022 at 04:29:11AM +0200, Halil Pasic wrote:
> >> > > > > On Mon, 25 Apr 2022 09:59:55 -0400
> >> > > > > "Michael S. Tsirkin"  wrote:
> >> > > > > 
> >> > > > > > On Mon, Apr 25, 2022 at 10:54:24AM +0200, Cornelia Huck wrote:
> >> > > > > > > On Mon, Apr 25 2022, "Michael S. Tsirkin"  
> >> > > > > > > wrote:
> >> > > > > > > > On Mon, Apr 25, 2022 at 10:44:15AM +0800, Jason Wang wrote:
> >> > > > > > > > > This patch tries to implement the synchronize_cbs() for 
> >> > > > > > > > > ccw. For the
> >> > > > > > > > > vring_interrupt() that is called via 
> >> > > > > > > > > virtio_airq_handler(), the
> >> > > > > > > > > synchronization is simply done via the airq_info's lock. 
> >> > > > > > > > > For the
> >> > > > > > > > > vring_interrupt() that is called via 
> >> > > > > > > > > virtio_ccw_int_handler(), a per
> >> > > > > > > > > device spinlock for irq is introduced ans used in the 
> >> > > > > > > > > synchronization
> >> > > > > > > > > method.
> >> > > > > > > > > 
> >> > > > > > > > > Cc: Thomas Gleixner 
> >> > > > > > > > > Cc: Peter Zijlstra 
> >> > > > > > > > > Cc: "Paul E. McKenney" 
> >> > > > > > > > > Cc: Marc Zyngier 
> >> > > > > > > > > Cc: Halil Pasic 
> >> > > > > > > > > Cc: Cornelia Huck 
> >> > > > > > > > > Signed-off-by: Jason Wang 
> >> > > > > > > > 
> >> > > > > > > > This is the only one that is giving me pause. Halil, 
> >> > > > > > > > Cornelia,
> >> > > > > > > > should we be concerned about the performance impact here?
> >> > > > > > > > Any chance it can be tested?
> >> > > > > > > We can have a bunch of devices using the same airq structure, 
> >> > > > > > > and the
> >> > > > > > > sync cb creates a choke point, same as 
> >> > > > > > > registering/unregistering.
> >> > > > > > BTW can callbacks for multiple VQs run on multiple CPUs at the 
> >> > > > > > moment?
> >> > > > > I'm not sure I understand the question.
> >> > > > > 
> >> > > > > I do think we can have multiple CPUs that are executing some 
> >> > > > > portion of
> >> > > > > virtio_ccw_int_handler(). So I guess the answer is yes. Connie 
> >> > > > > what do you think?
> >> > > > > 
> >> > > > > On the other hand we could also end up serializing 
> >> > > > > synchronize_cbs()
> >> > > > > calls for different devices if they happen to use the same 
> >> > > > > airq_info. But
> >> > > > > this probably was not your question
> >> > > > 
> >> > > > I am less concerned about  synchronize_cbs being slow and more about
> >> > > > the slowdown in interrupt processing itself.
> >> > > > 
> >> > > > > > this patch serializes them on a spinlock.
> >> > > > > > 
> >> > > > > Those could then pile up on the newly introduced spinlock.
> 
> How bad would that be in practice? IIUC, we hit on the spinlock when
> - doing synchronize_cbs (should be rare)
> - processing queue interrupts for devices using per-device indicators
>   (which is the non-preferred path, which I would basically only expect
>   when running on an ancient or non-standard hypervisor)

this one is my concern. I am worried serializing everything on a single lock
will drastically regress performance here.


> - configuration change interrupts (should be rare)
> - during setup, reset, etc. (should not be a concern)
> 
> >> > > > > 
> >> > > > > Regards,
> >> > > > > Halil
> >> > > > Hmm yea ... not good.
> >> > > Is there any other way to synchronize with all callbacks?
> >> > 
> >> > 
> >> > Maybe using rwlock as airq handler?
> >> > 
> >> > Thanks
> >> > 
> >> 
> >> rwlock is still a shared cacheline bouncing between CPUs and
> >> a bunch of ordering instructions.
> >> Maybe something per-cpu + some IPIs to run things on all CPUs instead?
> >
> > ... and I think classic and device interrupts are different enough
> > here ...
> 
> You mean classic (per-device) and adapter interrupts, right?

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH V3 6/9] virtio-ccw: implement synchronize_cbs()

2022-04-26 Thread Cornelia Huck
On Mon, Apr 25 2022, "Michael S. Tsirkin"  wrote:

> On Mon, Apr 25, 2022 at 11:53:24PM -0400, Michael S. Tsirkin wrote:
>> On Tue, Apr 26, 2022 at 11:42:45AM +0800, Jason Wang wrote:
>> > 
>> > 在 2022/4/26 11:38, Michael S. Tsirkin 写道:
>> > > On Mon, Apr 25, 2022 at 11:35:41PM -0400, Michael S. Tsirkin wrote:
>> > > > On Tue, Apr 26, 2022 at 04:29:11AM +0200, Halil Pasic wrote:
>> > > > > On Mon, 25 Apr 2022 09:59:55 -0400
>> > > > > "Michael S. Tsirkin"  wrote:
>> > > > > 
>> > > > > > On Mon, Apr 25, 2022 at 10:54:24AM +0200, Cornelia Huck wrote:
>> > > > > > > On Mon, Apr 25 2022, "Michael S. Tsirkin"  
>> > > > > > > wrote:
>> > > > > > > > On Mon, Apr 25, 2022 at 10:44:15AM +0800, Jason Wang wrote:
>> > > > > > > > > This patch tries to implement the synchronize_cbs() for ccw. 
>> > > > > > > > > For the
>> > > > > > > > > vring_interrupt() that is called via virtio_airq_handler(), 
>> > > > > > > > > the
>> > > > > > > > > synchronization is simply done via the airq_info's lock. For 
>> > > > > > > > > the
>> > > > > > > > > vring_interrupt() that is called via 
>> > > > > > > > > virtio_ccw_int_handler(), a per
>> > > > > > > > > device spinlock for irq is introduced ans used in the 
>> > > > > > > > > synchronization
>> > > > > > > > > method.
>> > > > > > > > > 
>> > > > > > > > > Cc: Thomas Gleixner 
>> > > > > > > > > Cc: Peter Zijlstra 
>> > > > > > > > > Cc: "Paul E. McKenney" 
>> > > > > > > > > Cc: Marc Zyngier 
>> > > > > > > > > Cc: Halil Pasic 
>> > > > > > > > > Cc: Cornelia Huck 
>> > > > > > > > > Signed-off-by: Jason Wang 
>> > > > > > > > 
>> > > > > > > > This is the only one that is giving me pause. Halil, Cornelia,
>> > > > > > > > should we be concerned about the performance impact here?
>> > > > > > > > Any chance it can be tested?
>> > > > > > > We can have a bunch of devices using the same airq structure, 
>> > > > > > > and the
>> > > > > > > sync cb creates a choke point, same as registering/unregistering.
>> > > > > > BTW can callbacks for multiple VQs run on multiple CPUs at the 
>> > > > > > moment?
>> > > > > I'm not sure I understand the question.
>> > > > > 
>> > > > > I do think we can have multiple CPUs that are executing some portion 
>> > > > > of
>> > > > > virtio_ccw_int_handler(). So I guess the answer is yes. Connie what 
>> > > > > do you think?
>> > > > > 
>> > > > > On the other hand we could also end up serializing synchronize_cbs()
>> > > > > calls for different devices if they happen to use the same 
>> > > > > airq_info. But
>> > > > > this probably was not your question
>> > > > 
>> > > > I am less concerned about  synchronize_cbs being slow and more about
>> > > > the slowdown in interrupt processing itself.
>> > > > 
>> > > > > > this patch serializes them on a spinlock.
>> > > > > > 
>> > > > > Those could then pile up on the newly introduced spinlock.

How bad would that be in practice? IIUC, we hit on the spinlock when
- doing synchronize_cbs (should be rare)
- processing queue interrupts for devices using per-device indicators
  (which is the non-preferred path, which I would basically only expect
  when running on an ancient or non-standard hypervisor)
- configuration change interrupts (should be rare)
- during setup, reset, etc. (should not be a concern)

>> > > > > 
>> > > > > Regards,
>> > > > > Halil
>> > > > Hmm yea ... not good.
>> > > Is there any other way to synchronize with all callbacks?
>> > 
>> > 
>> > Maybe using rwlock as airq handler?
>> > 
>> > Thanks
>> > 
>> 
>> rwlock is still a shared cacheline bouncing between CPUs and
>> a bunch of ordering instructions.
>> Maybe something per-cpu + some IPIs to run things on all CPUs instead?
>
> ... and I think classic and device interrupts are different enough
> here ...

You mean classic (per-device) and adapter interrupts, right?

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH v2] vduse: Fix NULL pointer dereference on sysfs access

2022-04-26 Thread Michael S. Tsirkin
On Tue, Apr 26, 2022 at 10:37:17PM +0800, Yongji Xie wrote:
> On Tue, Apr 26, 2022 at 10:18 PM Michael S. Tsirkin  wrote:
> >
> > On Tue, Apr 26, 2022 at 10:02:02PM +0800, Yongji Xie wrote:
> > > > This should not be needed, when your module is unloaded, all devices it
> > > > handled should be properly removed by it.
> > > >
> > >
> > > I see. But it's not easy to achieve that currently. Maybe we need
> > > something like DEVICE_NEEDS_RESET support in virtio core.
> >
> > Not sure what the connection is.
> >
> 
> If we want to force remove all working vduse devices during module
> unload, we might need to send a DEVICE_NEEDS_RESET notification to
> device driver to do some cleanup before, e.g., return error for all
> inflight I/Os.
> 
> Thanks,
> Yongji

IMHO DEVICE_NEEDS_RESET won't help much with that, it's more in
case device is still there but needs a reset to start working.

-- 
MST

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v2] vduse: Fix NULL pointer dereference on sysfs access

2022-04-26 Thread Michael S. Tsirkin
On Tue, Apr 26, 2022 at 10:02:02PM +0800, Yongji Xie wrote:
> > This should not be needed, when your module is unloaded, all devices it
> > handled should be properly removed by it.
> >
> 
> I see. But it's not easy to achieve that currently. Maybe we need
> something like DEVICE_NEEDS_RESET support in virtio core.

Not sure what the connection is.

-- 
MST

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v2] vduse: Fix NULL pointer dereference on sysfs access

2022-04-26 Thread Greg KH
On Tue, Apr 26, 2022 at 10:02:02PM +0800, Yongji Xie wrote:
> On Tue, Apr 26, 2022 at 9:34 PM Greg KH  wrote:
> >
> > On Tue, Apr 26, 2022 at 09:17:23PM +0800, Yongji Xie wrote:
> > > On Tue, Apr 26, 2022 at 8:44 PM Greg KH  
> > > wrote:
> > > >
> > > > On Tue, Apr 26, 2022 at 08:30:38PM +0800, Yongji Xie wrote:
> > > > > On Tue, Apr 26, 2022 at 6:26 PM Greg KH  
> > > > > wrote:
> > > > > >
> > > > > > On Tue, Apr 26, 2022 at 05:41:15PM +0800, Yongji Xie wrote:
> > > > > > > On Tue, Apr 26, 2022 at 4:07 PM Greg KH 
> > > > > > >  wrote:
> > > > > > > >
> > > > > > > > On Tue, Apr 26, 2022 at 03:36:56PM +0800, Xie Yongji wrote:
> > > > > > > > > The control device has no drvdata. So we will get a
> > > > > > > > > NULL pointer dereference when accessing control
> > > > > > > > > device's msg_timeout attribute via sysfs:
> > > > > > > > >
> > > > > > > > > [ 132.841881][ T3644] BUG: kernel NULL pointer dereference, 
> > > > > > > > > address: 00f8
> > > > > > > > > [ 132.850619][ T3644] RIP: 0010:msg_timeout_show 
> > > > > > > > > (drivers/vdpa/vdpa_user/vduse_dev.c:1271)
> > > > > > > > > [ 132.869447][ T3644] dev_attr_show (drivers/base/core.c:2094)
> > > > > > > > > [ 132.870215][ T3644] sysfs_kf_seq_show (fs/sysfs/file.c:59)
> > > > > > > > > [ 132.871164][ T3644] ? device_remove_bin_file 
> > > > > > > > > (drivers/base/core.c:2088)
> > > > > > > > > [ 132.872082][ T3644] kernfs_seq_show (fs/kernfs/file.c:164)
> > > > > > > > > [ 132.872838][ T3644] seq_read_iter (fs/seq_file.c:230)
> > > > > > > > > [ 132.873578][ T3644] ? __vmalloc_area_node 
> > > > > > > > > (mm/vmalloc.c:3041)
> > > > > > > > > [ 132.874532][ T3644] kernfs_fop_read_iter 
> > > > > > > > > (fs/kernfs/file.c:238)
> > > > > > > > > [ 132.875513][ T3644] __kernel_read (fs/read_write.c:440 
> > > > > > > > > (discriminator 1))
> > > > > > > > > [ 132.876319][ T3644] kernel_read (fs/read_write.c:459)
> > > > > > > > > [ 132.877129][ T3644] kernel_read_file 
> > > > > > > > > (fs/kernel_read_file.c:94)
> > > > > > > > > [ 132.877978][ T3644] kernel_read_file_from_fd 
> > > > > > > > > (include/linux/file.h:45 fs/kernel_read_file.c:186)
> > > > > > > > > [ 132.879019][ T3644] __do_sys_finit_module 
> > > > > > > > > (kernel/module.c:4207)
> > > > > > > > > [ 132.879930][ T3644] __ia32_sys_finit_module 
> > > > > > > > > (kernel/module.c:4189)
> > > > > > > > > [ 132.880930][ T3644] do_int80_syscall_32 
> > > > > > > > > (arch/x86/entry/common.c:112 arch/x86/entry/common.c:132)
> > > > > > > > > [ 132.881847][ T3644] entry_INT80_compat 
> > > > > > > > > (arch/x86/entry/entry_64_compat.S:419)
> > > > > > > > >
> > > > > > > > > To fix it, don't create the unneeded attribute for
> > > > > > > > > control device anymore.
> > > > > > > > >
> > > > > > > > > Fixes: c8a6153b6c59 ("vduse: Introduce VDUSE - vDPA Device in 
> > > > > > > > > Userspace")
> > > > > > > > > Reported-by: kernel test robot 
> > > > > > > > > Cc: sta...@vger.kernel.org
> > > > > > > > > Signed-off-by: Xie Yongji 
> > > > > > > > > ---
> > > > > > > > >  drivers/vdpa/vdpa_user/vduse_dev.c | 7 +++
> > > > > > > > >  1 file changed, 3 insertions(+), 4 deletions(-)
> > > > > > > > >
> > > > > > > > > diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c 
> > > > > > > > > b/drivers/vdpa/vdpa_user/vduse_dev.c
> > > > > > > > > index f85d1a08ed87..160e40d03084 100644
> > > > > > > > > --- a/drivers/vdpa/vdpa_user/vduse_dev.c
> > > > > > > > > +++ b/drivers/vdpa/vdpa_user/vduse_dev.c
> > > > > > > > > @@ -1344,9 +1344,9 @@ static int vduse_create_dev(struct 
> > > > > > > > > vduse_dev_config *config,
> > > > > > > > >
> > > > > > > > >   dev->minor = ret;
> > > > > > > > >   dev->msg_timeout = VDUSE_MSG_DEFAULT_TIMEOUT;
> > > > > > > > > - dev->dev = device_create(vduse_class, NULL,
> > > > > > > > > -  MKDEV(MAJOR(vduse_major), 
> > > > > > > > > dev->minor),
> > > > > > > > > -  dev, "%s", config->name);
> > > > > > > > > + dev->dev = device_create_with_groups(vduse_class, NULL,
> > > > > > > > > + MKDEV(MAJOR(vduse_major), 
> > > > > > > > > dev->minor),
> > > > > > > > > + dev, vduse_dev_groups, "%s", 
> > > > > > > > > config->name);
> > > > > > > > >   if (IS_ERR(dev->dev)) {
> > > > > > > > >   ret = PTR_ERR(dev->dev);
> > > > > > > > >   goto err_dev;
> > > > > > > > > @@ -1595,7 +1595,6 @@ static int vduse_init(void)
> > > > > > > > >   return PTR_ERR(vduse_class);
> > > > > > > > >
> > > > > > > > >   vduse_class->devnode = vduse_devnode;
> > > > > > > > > - vduse_class->dev_groups = vduse_dev_groups;
> > > > > > > >
> > > > > > > > Ok, this looks much better.
> > > > > > > >
> > > > > > > > But wow, there are some problems in this code overall.  I see a 
> > > > > > > > number
> > > > > > > > of flat-out-wrong things in there that should have been caught 
> > > > > > > > by code
>

[PATCH 0/2] kernel: add new infrastructure for platform_has() support

2022-04-26 Thread Juergen Gross via Virtualization
In another patch series [1] the need has come up to have support for
a generic feature flag infrastructure.

This patch series is introducing that infrastructure and adds the first
use case.

I have decided to use a similar interface as the already known x86
cpu_has() function. As the new infrastructure is meant to be usable for
general and arch-specific feature flags, the flags are being spread
between a general bitmap and an arch specific one.

The bitmaps start all being zero, single features can be set or reset
at any time by using the related platform_[re]set_feature() functions.

The platform_has() function is using a simple test_bit() call for now,
further optimization might be added when needed.

[1]: 
https://lore.kernel.org/lkml/1650646263-22047-1-git-send-email-olekst...@gmail.com/T/#t

Juergen Gross (2):
  kernel: add platform_has() infrastructure
  virtio: replace arch_has_restricted_virtio_memory_access()

 MAINTAINERS|  8 +++
 arch/s390/Kconfig  |  1 -
 arch/s390/mm/init.c| 13 +++
 arch/x86/Kconfig   |  1 -
 arch/x86/kernel/cpu/mshyperv.c |  5 -
 arch/x86/mm/mem_encrypt.c  |  6 --
 arch/x86/mm/mem_encrypt_identity.c |  5 +
 drivers/virtio/Kconfig |  6 --
 drivers/virtio/virtio.c|  5 ++---
 include/asm-generic/Kbuild |  1 +
 include/asm-generic/platform-feature.h |  8 +++
 include/linux/platform-feature.h   | 30 ++
 include/linux/virtio_config.h  |  9 
 kernel/Makefile|  2 +-
 kernel/platform-feature.c  |  7 ++
 15 files changed, 69 insertions(+), 38 deletions(-)
 create mode 100644 include/asm-generic/platform-feature.h
 create mode 100644 include/linux/platform-feature.h
 create mode 100644 kernel/platform-feature.c

-- 
2.34.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH 2/2] virtio: replace arch_has_restricted_virtio_memory_access()

2022-04-26 Thread Juergen Gross via Virtualization
Instead of using arch_has_restricted_virtio_memory_access() together
with CONFIG_ARCH_HAS_RESTRICTED_VIRTIO_MEMORY_ACCESS, replace those
with platform_has() and a new platform feature
PLATFORM_VIRTIO_RESTRICTED_MEM_ACCESS.

Signed-off-by: Juergen Gross 
---
I've only done a compile test on x86 for now, as I can't test these
changes easily (SEV might be doable for me, but s390 isn't).
---
 arch/s390/Kconfig  |  1 -
 arch/s390/mm/init.c| 13 +++--
 arch/x86/Kconfig   |  1 -
 arch/x86/kernel/cpu/mshyperv.c |  5 -
 arch/x86/mm/mem_encrypt.c  |  6 --
 arch/x86/mm/mem_encrypt_identity.c |  5 +
 drivers/virtio/Kconfig |  6 --
 drivers/virtio/virtio.c|  5 ++---
 include/linux/platform-feature.h   |  3 ++-
 include/linux/virtio_config.h  |  9 -
 10 files changed, 16 insertions(+), 38 deletions(-)

diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
index e084c72104f8..f97a22ae69a8 100644
--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -772,7 +772,6 @@ menu "Virtualization"
 config PROTECTED_VIRTUALIZATION_GUEST
def_bool n
prompt "Protected virtualization guest support"
-   select ARCH_HAS_RESTRICTED_VIRTIO_MEMORY_ACCESS
help
  Select this option, if you want to be able to run this
  kernel as a protected virtualization KVM guest.
diff --git a/arch/s390/mm/init.c b/arch/s390/mm/init.c
index 86ffd0d51fd5..8e4fa10c6b12 100644
--- a/arch/s390/mm/init.c
+++ b/arch/s390/mm/init.c
@@ -31,6 +31,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -168,22 +169,14 @@ bool force_dma_unencrypted(struct device *dev)
return is_prot_virt_guest();
 }
 
-#ifdef CONFIG_ARCH_HAS_RESTRICTED_VIRTIO_MEMORY_ACCESS
-
-int arch_has_restricted_virtio_memory_access(void)
-{
-   return is_prot_virt_guest();
-}
-EXPORT_SYMBOL(arch_has_restricted_virtio_memory_access);
-
-#endif
-
 /* protected virtualization */
 static void pv_init(void)
 {
if (!is_prot_virt_guest())
return;
 
+   platform_set_feature(PLATFORM_VIRTIO_RESTRICTED_MEM_ACCESS);
+
/* make sure bounce buffers are shared */
swiotlb_force = SWIOTLB_FORCE;
swiotlb_init(1);
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index b0142e01002e..20ac72546ae4 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1515,7 +1515,6 @@ config X86_CPA_STATISTICS
 config X86_MEM_ENCRYPT
select ARCH_HAS_FORCE_DMA_UNENCRYPTED
select DYNAMIC_PHYSICAL_MASK
-   select ARCH_HAS_RESTRICTED_VIRTIO_MEMORY_ACCESS
def_bool n
 
 config AMD_MEM_ENCRYPT
diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
index 4b67094215bb..435611d83895 100644
--- a/arch/x86/kernel/cpu/mshyperv.c
+++ b/arch/x86/kernel/cpu/mshyperv.c
@@ -19,6 +19,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -347,8 +348,10 @@ static void __init ms_hyperv_init_platform(void)
 #endif
/* Isolation VMs are unenlightened SEV-based VMs, thus this 
check: */
if (IS_ENABLED(CONFIG_AMD_MEM_ENCRYPT)) {
-   if (hv_get_isolation_type() != HV_ISOLATION_TYPE_NONE)
+   if (hv_get_isolation_type() != HV_ISOLATION_TYPE_NONE) {
cc_set_vendor(CC_VENDOR_HYPERV);
+   
platform_set_feature(PLATFORM_VIRTIO_RESTRICTED_MEM_ACCESS);
+   }
}
}
 
diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
index 50d209939c66..9b6a7c98b2b1 100644
--- a/arch/x86/mm/mem_encrypt.c
+++ b/arch/x86/mm/mem_encrypt.c
@@ -76,9 +76,3 @@ void __init mem_encrypt_init(void)
 
print_mem_encrypt_feature_info();
 }
-
-int arch_has_restricted_virtio_memory_access(void)
-{
-   return cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT);
-}
-EXPORT_SYMBOL_GPL(arch_has_restricted_virtio_memory_access);
diff --git a/arch/x86/mm/mem_encrypt_identity.c 
b/arch/x86/mm/mem_encrypt_identity.c
index b43bc24d2bb6..6043ba6cd17d 100644
--- a/arch/x86/mm/mem_encrypt_identity.c
+++ b/arch/x86/mm/mem_encrypt_identity.c
@@ -40,6 +40,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -566,6 +567,10 @@ void __init sme_enable(struct boot_params *bp)
} else {
/* SEV state cannot be controlled by a command line option */
sme_me_mask = me_mask;
+
+   /* Set restricted memory access for virtio. */
+   platform_set_feature(PLATFORM_VIRTIO_RESTRICTED_MEM_ACCESS);
+
goto out;
}
 
diff --git a/drivers/virtio/Kconfig b/drivers/virtio/Kconfig
index b5adf6abd241..a6dc8b5846fe 100644
--- a/drivers/virtio/Kconfig
+++ b/drivers/virtio/Kconfig
@@ -6,12 +6,6 @@ config VIRTIO
  bus, such as CONFIG_VIRTIO_PCI, CONFIG_VIRTIO_MMIO, CONFIG_RPMSG
  or CONFIG_S390_GUEST.
 
-conf

[PATCH 1/2] kernel: add platform_has() infrastructure

2022-04-26 Thread Juergen Gross via Virtualization
Add a simple infrastructure for setting, resetting and querying
platform feature flags.

Flags can be either global or architecture specific.

Signed-off-by: Juergen Gross 
---
 MAINTAINERS|  8 +++
 include/asm-generic/Kbuild |  1 +
 include/asm-generic/platform-feature.h |  8 +++
 include/linux/platform-feature.h   | 29 ++
 kernel/Makefile|  2 +-
 kernel/platform-feature.c  |  7 +++
 6 files changed, 54 insertions(+), 1 deletion(-)
 create mode 100644 include/asm-generic/platform-feature.h
 create mode 100644 include/linux/platform-feature.h
 create mode 100644 kernel/platform-feature.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 5e8c2f611766..eb943f089eda 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -15650,6 +15650,14 @@ S: Maintained
 F: Documentation/devicetree/bindings/iio/chemical/plantower,pms7003.yaml
 F: drivers/iio/chemical/pms7003.c
 
+PLATFORM FEATURE INFRASTRUCTURE
+M: Juergen Gross 
+S: Maintained
+F: arch/*/include/asm/platform-feature.h
+F: include/asm-generic/platform-feature.h
+F: include/linux/platform-feature.h
+F: kernel/platform-feature.c
+
 PLDMFW LIBRARY
 M: Jacob Keller 
 S: Maintained
diff --git a/include/asm-generic/Kbuild b/include/asm-generic/Kbuild
index 302506bbc2a4..8e47d483b524 100644
--- a/include/asm-generic/Kbuild
+++ b/include/asm-generic/Kbuild
@@ -44,6 +44,7 @@ mandatory-y += msi.h
 mandatory-y += pci.h
 mandatory-y += percpu.h
 mandatory-y += pgalloc.h
+mandatory-y += platform-feature.h
 mandatory-y += preempt.h
 mandatory-y += rwonce.h
 mandatory-y += sections.h
diff --git a/include/asm-generic/platform-feature.h 
b/include/asm-generic/platform-feature.h
new file mode 100644
index ..4b0af3d51588
--- /dev/null
+++ b/include/asm-generic/platform-feature.h
@@ -0,0 +1,8 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_GENERIC_PLATFORM_FEATURE_H
+#define _ASM_GENERIC_PLATFORM_FEATURE_H
+
+/* Number of arch specific feature flags. */
+#define PLATFORM_ARCH_FEAT_N   0
+
+#endif /* _ASM_GENERIC_PLATFORM_FEATURE_H */
diff --git a/include/linux/platform-feature.h b/include/linux/platform-feature.h
new file mode 100644
index ..df393d502a4f
--- /dev/null
+++ b/include/linux/platform-feature.h
@@ -0,0 +1,29 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _PLATFORM_FEATURE_H
+#define _PLATFORM_FEATURE_H
+
+#include 
+#include 
+
+/* The platform features are starting with the architecture specific ones. */
+#define PLATFORM_FEAT_N(0 + 
PLATFORM_ARCH_FEAT_N)
+
+#define PLATFORM_FEAT_ARRAY_SZ BITS_TO_LONGS(PLATFORM_FEAT_N)
+extern unsigned long platform_features[PLATFORM_FEAT_ARRAY_SZ];
+
+static inline void platform_set_feature(unsigned int feature)
+{
+   set_bit(feature, platform_features);
+}
+
+static inline void platform_reset_feature(unsigned int feature)
+{
+   clear_bit(feature, platform_features);
+}
+
+static inline bool platform_has(unsigned int feature)
+{
+   return test_bit(feature, platform_features);
+}
+
+#endif /* _PLATFORM_FEATURE_H */
diff --git a/kernel/Makefile b/kernel/Makefile
index 847a82bfe0e3..2f412f80110d 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -7,7 +7,7 @@ obj-y = fork.o exec_domain.o panic.o \
cpu.o exit.o softirq.o resource.o \
sysctl.o capability.o ptrace.o user.o \
signal.o sys.o umh.o workqueue.o pid.o task_work.o \
-   extable.o params.o \
+   extable.o params.o platform-feature.o \
kthread.o sys_ni.o nsproxy.o \
notifier.o ksysfs.o cred.o reboot.o \
async.o range.o smpboot.o ucount.o regset.o
diff --git a/kernel/platform-feature.c b/kernel/platform-feature.c
new file mode 100644
index ..2d52f8442cd5
--- /dev/null
+++ b/kernel/platform-feature.c
@@ -0,0 +1,7 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include 
+#include 
+
+unsigned long __read_mostly platform_features[PLATFORM_FEAT_ARRAY_SZ];
+EXPORT_SYMBOL_GPL(platform_features);
-- 
2.34.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v2] vduse: Fix NULL pointer dereference on sysfs access

2022-04-26 Thread Greg KH
On Tue, Apr 26, 2022 at 09:17:23PM +0800, Yongji Xie wrote:
> On Tue, Apr 26, 2022 at 8:44 PM Greg KH  wrote:
> >
> > On Tue, Apr 26, 2022 at 08:30:38PM +0800, Yongji Xie wrote:
> > > On Tue, Apr 26, 2022 at 6:26 PM Greg KH  
> > > wrote:
> > > >
> > > > On Tue, Apr 26, 2022 at 05:41:15PM +0800, Yongji Xie wrote:
> > > > > On Tue, Apr 26, 2022 at 4:07 PM Greg KH  
> > > > > wrote:
> > > > > >
> > > > > > On Tue, Apr 26, 2022 at 03:36:56PM +0800, Xie Yongji wrote:
> > > > > > > The control device has no drvdata. So we will get a
> > > > > > > NULL pointer dereference when accessing control
> > > > > > > device's msg_timeout attribute via sysfs:
> > > > > > >
> > > > > > > [ 132.841881][ T3644] BUG: kernel NULL pointer dereference, 
> > > > > > > address: 00f8
> > > > > > > [ 132.850619][ T3644] RIP: 0010:msg_timeout_show 
> > > > > > > (drivers/vdpa/vdpa_user/vduse_dev.c:1271)
> > > > > > > [ 132.869447][ T3644] dev_attr_show (drivers/base/core.c:2094)
> > > > > > > [ 132.870215][ T3644] sysfs_kf_seq_show (fs/sysfs/file.c:59)
> > > > > > > [ 132.871164][ T3644] ? device_remove_bin_file 
> > > > > > > (drivers/base/core.c:2088)
> > > > > > > [ 132.872082][ T3644] kernfs_seq_show (fs/kernfs/file.c:164)
> > > > > > > [ 132.872838][ T3644] seq_read_iter (fs/seq_file.c:230)
> > > > > > > [ 132.873578][ T3644] ? __vmalloc_area_node (mm/vmalloc.c:3041)
> > > > > > > [ 132.874532][ T3644] kernfs_fop_read_iter (fs/kernfs/file.c:238)
> > > > > > > [ 132.875513][ T3644] __kernel_read (fs/read_write.c:440 
> > > > > > > (discriminator 1))
> > > > > > > [ 132.876319][ T3644] kernel_read (fs/read_write.c:459)
> > > > > > > [ 132.877129][ T3644] kernel_read_file (fs/kernel_read_file.c:94)
> > > > > > > [ 132.877978][ T3644] kernel_read_file_from_fd 
> > > > > > > (include/linux/file.h:45 fs/kernel_read_file.c:186)
> > > > > > > [ 132.879019][ T3644] __do_sys_finit_module (kernel/module.c:4207)
> > > > > > > [ 132.879930][ T3644] __ia32_sys_finit_module 
> > > > > > > (kernel/module.c:4189)
> > > > > > > [ 132.880930][ T3644] do_int80_syscall_32 
> > > > > > > (arch/x86/entry/common.c:112 arch/x86/entry/common.c:132)
> > > > > > > [ 132.881847][ T3644] entry_INT80_compat 
> > > > > > > (arch/x86/entry/entry_64_compat.S:419)
> > > > > > >
> > > > > > > To fix it, don't create the unneeded attribute for
> > > > > > > control device anymore.
> > > > > > >
> > > > > > > Fixes: c8a6153b6c59 ("vduse: Introduce VDUSE - vDPA Device in 
> > > > > > > Userspace")
> > > > > > > Reported-by: kernel test robot 
> > > > > > > Cc: sta...@vger.kernel.org
> > > > > > > Signed-off-by: Xie Yongji 
> > > > > > > ---
> > > > > > >  drivers/vdpa/vdpa_user/vduse_dev.c | 7 +++
> > > > > > >  1 file changed, 3 insertions(+), 4 deletions(-)
> > > > > > >
> > > > > > > diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c 
> > > > > > > b/drivers/vdpa/vdpa_user/vduse_dev.c
> > > > > > > index f85d1a08ed87..160e40d03084 100644
> > > > > > > --- a/drivers/vdpa/vdpa_user/vduse_dev.c
> > > > > > > +++ b/drivers/vdpa/vdpa_user/vduse_dev.c
> > > > > > > @@ -1344,9 +1344,9 @@ static int vduse_create_dev(struct 
> > > > > > > vduse_dev_config *config,
> > > > > > >
> > > > > > >   dev->minor = ret;
> > > > > > >   dev->msg_timeout = VDUSE_MSG_DEFAULT_TIMEOUT;
> > > > > > > - dev->dev = device_create(vduse_class, NULL,
> > > > > > > -  MKDEV(MAJOR(vduse_major), 
> > > > > > > dev->minor),
> > > > > > > -  dev, "%s", config->name);
> > > > > > > + dev->dev = device_create_with_groups(vduse_class, NULL,
> > > > > > > + MKDEV(MAJOR(vduse_major), 
> > > > > > > dev->minor),
> > > > > > > + dev, vduse_dev_groups, "%s", 
> > > > > > > config->name);
> > > > > > >   if (IS_ERR(dev->dev)) {
> > > > > > >   ret = PTR_ERR(dev->dev);
> > > > > > >   goto err_dev;
> > > > > > > @@ -1595,7 +1595,6 @@ static int vduse_init(void)
> > > > > > >   return PTR_ERR(vduse_class);
> > > > > > >
> > > > > > >   vduse_class->devnode = vduse_devnode;
> > > > > > > - vduse_class->dev_groups = vduse_dev_groups;
> > > > > >
> > > > > > Ok, this looks much better.
> > > > > >
> > > > > > But wow, there are some problems in this code overall.  I see a 
> > > > > > number
> > > > > > of flat-out-wrong things in there that should have been caught by 
> > > > > > code
> > > > > > reviews.  Some examples:
> > > > > > - empty release() callbacks.  That is a huge sign the code
> > > > > >   design is wrong and broken and you are just trying to 
> > > > > > make the
> > > > > >   driver core quiet for some reason.  The documentation in 
> > > > > > the
> > > > > >   kernel explains why this is not ok.
> > > > >
> > > > > Sorry, I failed to find the documentation. Do you mean we should
> > > > > remove the empty release() callbacks?
> > > >
> > > > Yes, why are they needed?
> > > >

Re: [PATCH v2] vduse: Fix NULL pointer dereference on sysfs access

2022-04-26 Thread Greg KH
On Tue, Apr 26, 2022 at 08:30:38PM +0800, Yongji Xie wrote:
> On Tue, Apr 26, 2022 at 6:26 PM Greg KH  wrote:
> >
> > On Tue, Apr 26, 2022 at 05:41:15PM +0800, Yongji Xie wrote:
> > > On Tue, Apr 26, 2022 at 4:07 PM Greg KH  
> > > wrote:
> > > >
> > > > On Tue, Apr 26, 2022 at 03:36:56PM +0800, Xie Yongji wrote:
> > > > > The control device has no drvdata. So we will get a
> > > > > NULL pointer dereference when accessing control
> > > > > device's msg_timeout attribute via sysfs:
> > > > >
> > > > > [ 132.841881][ T3644] BUG: kernel NULL pointer dereference, address: 
> > > > > 00f8
> > > > > [ 132.850619][ T3644] RIP: 0010:msg_timeout_show 
> > > > > (drivers/vdpa/vdpa_user/vduse_dev.c:1271)
> > > > > [ 132.869447][ T3644] dev_attr_show (drivers/base/core.c:2094)
> > > > > [ 132.870215][ T3644] sysfs_kf_seq_show (fs/sysfs/file.c:59)
> > > > > [ 132.871164][ T3644] ? device_remove_bin_file 
> > > > > (drivers/base/core.c:2088)
> > > > > [ 132.872082][ T3644] kernfs_seq_show (fs/kernfs/file.c:164)
> > > > > [ 132.872838][ T3644] seq_read_iter (fs/seq_file.c:230)
> > > > > [ 132.873578][ T3644] ? __vmalloc_area_node (mm/vmalloc.c:3041)
> > > > > [ 132.874532][ T3644] kernfs_fop_read_iter (fs/kernfs/file.c:238)
> > > > > [ 132.875513][ T3644] __kernel_read (fs/read_write.c:440 
> > > > > (discriminator 1))
> > > > > [ 132.876319][ T3644] kernel_read (fs/read_write.c:459)
> > > > > [ 132.877129][ T3644] kernel_read_file (fs/kernel_read_file.c:94)
> > > > > [ 132.877978][ T3644] kernel_read_file_from_fd 
> > > > > (include/linux/file.h:45 fs/kernel_read_file.c:186)
> > > > > [ 132.879019][ T3644] __do_sys_finit_module (kernel/module.c:4207)
> > > > > [ 132.879930][ T3644] __ia32_sys_finit_module (kernel/module.c:4189)
> > > > > [ 132.880930][ T3644] do_int80_syscall_32 
> > > > > (arch/x86/entry/common.c:112 arch/x86/entry/common.c:132)
> > > > > [ 132.881847][ T3644] entry_INT80_compat 
> > > > > (arch/x86/entry/entry_64_compat.S:419)
> > > > >
> > > > > To fix it, don't create the unneeded attribute for
> > > > > control device anymore.
> > > > >
> > > > > Fixes: c8a6153b6c59 ("vduse: Introduce VDUSE - vDPA Device in 
> > > > > Userspace")
> > > > > Reported-by: kernel test robot 
> > > > > Cc: sta...@vger.kernel.org
> > > > > Signed-off-by: Xie Yongji 
> > > > > ---
> > > > >  drivers/vdpa/vdpa_user/vduse_dev.c | 7 +++
> > > > >  1 file changed, 3 insertions(+), 4 deletions(-)
> > > > >
> > > > > diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c 
> > > > > b/drivers/vdpa/vdpa_user/vduse_dev.c
> > > > > index f85d1a08ed87..160e40d03084 100644
> > > > > --- a/drivers/vdpa/vdpa_user/vduse_dev.c
> > > > > +++ b/drivers/vdpa/vdpa_user/vduse_dev.c
> > > > > @@ -1344,9 +1344,9 @@ static int vduse_create_dev(struct 
> > > > > vduse_dev_config *config,
> > > > >
> > > > >   dev->minor = ret;
> > > > >   dev->msg_timeout = VDUSE_MSG_DEFAULT_TIMEOUT;
> > > > > - dev->dev = device_create(vduse_class, NULL,
> > > > > -  MKDEV(MAJOR(vduse_major), dev->minor),
> > > > > -  dev, "%s", config->name);
> > > > > + dev->dev = device_create_with_groups(vduse_class, NULL,
> > > > > + MKDEV(MAJOR(vduse_major), dev->minor),
> > > > > + dev, vduse_dev_groups, "%s", 
> > > > > config->name);
> > > > >   if (IS_ERR(dev->dev)) {
> > > > >   ret = PTR_ERR(dev->dev);
> > > > >   goto err_dev;
> > > > > @@ -1595,7 +1595,6 @@ static int vduse_init(void)
> > > > >   return PTR_ERR(vduse_class);
> > > > >
> > > > >   vduse_class->devnode = vduse_devnode;
> > > > > - vduse_class->dev_groups = vduse_dev_groups;
> > > >
> > > > Ok, this looks much better.
> > > >
> > > > But wow, there are some problems in this code overall.  I see a number
> > > > of flat-out-wrong things in there that should have been caught by code
> > > > reviews.  Some examples:
> > > > - empty release() callbacks.  That is a huge sign the code
> > > >   design is wrong and broken and you are just trying to make the
> > > >   driver core quiet for some reason.  The documentation in the
> > > >   kernel explains why this is not ok.
> > >
> > > Sorry, I failed to find the documentation. Do you mean we should
> > > remove the empty release() callbacks?
> >
> > Yes, why are they needed?
> >
> > (hint, retorical question, you added them to remove the driver core
> > warning when the device is removed, which means someone added them just
> > because they thought that their code could ignore the hints that the
> > driver core was telling them.)
> >
> 
> OK, I see.
> 
> > Please properly free the memory here.
> >
> 
> One question is how to deal with the case if the device/kobject is
> defined as a static variable. We should not need to free any resources
> in this case. Or do you suggest just using dynamic allocation here?

A kobject can NEVER be a static vari

Re: [PATCH v4 7/8] tests/crypto: Add test suite for crypto akcipher

2022-04-26 Thread Daniel P . Berrangé
On Mon, Apr 11, 2022 at 06:43:26PM +0800, zhenwei pi wrote:
> From: lei he 
> 
> Add unit test and benchmark test for crypto akcipher.
> 
> Signed-off-by: lei he 
> Signed-off-by: zhenwei pi 
> ---
>  tests/bench/benchmark-crypto-akcipher.c | 161 ++
>  tests/bench/meson.build |   4 +
>  tests/bench/test_akcipher_keys.inc  | 537 ++
>  tests/unit/meson.build  |   1 +
>  tests/unit/test-crypto-akcipher.c   | 708 
>  5 files changed, 1411 insertions(+)
>  create mode 100644 tests/bench/benchmark-crypto-akcipher.c
>  create mode 100644 tests/bench/test_akcipher_keys.inc
>  create mode 100644 tests/unit/test-crypto-akcipher.c
> 

> diff --git a/tests/bench/test_akcipher_keys.inc 
> b/tests/bench/test_akcipher_keys.inc
> new file mode 100644
> index 00..7adf218135
> --- /dev/null
> +++ b/tests/bench/test_akcipher_keys.inc
> @@ -0,0 +1,537 @@
> +/*
> + * Copyright (c) 2022 Bytedance, and/or its affiliates
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory.
> + *
> + * Author: lei he 
> + */
> +
> +/* RSA test keys, generated by OpenSSL */
> +static const uint8_t rsa1024_priv_key[] = {
> +0x30, 0x82, 0x02, 0x5c, 0x02, 0x01, 0x00, 0x02,
> + 0x81, 0x81, 0x00, 0xe6, 0x4d, 0x76, 0x4f, 0xb2,

snip

For the patch as is:

 Reviewed-by: Daniel P. Berrangé 


It could be nice to add another test with some intentionally corrupt
RSA keys with bad DER encoding, as a way to prove that we're handling
errors in DER decoding correctly when faced with malicous data from a
bad guest.

With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH v4 6/8] crypto: Implement RSA algorithm by gcrypt

2022-04-26 Thread Daniel P . Berrangé
On Mon, Apr 11, 2022 at 06:43:25PM +0800, zhenwei pi wrote:
> From: Lei He 
> 
> Added gcryt implementation of RSA algorithm, RSA algorithm
> implemented by gcrypt has a higher priority than nettle because
> it supports raw padding.
> 
> Signed-off-by: Lei He 
> ---
>  crypto/akcipher-gcrypt.c.inc | 531 +++
>  crypto/akcipher.c|   4 +-
>  2 files changed, 534 insertions(+), 1 deletion(-)
>  create mode 100644 crypto/akcipher-gcrypt.c.inc
> 
> diff --git a/crypto/akcipher-gcrypt.c.inc b/crypto/akcipher-gcrypt.c.inc
> new file mode 100644
> index 00..c109bf0566
> --- /dev/null
> +++ b/crypto/akcipher-gcrypt.c.inc
> @@ -0,0 +1,531 @@
> +/*
> + * QEMU Crypto akcipher algorithms
> + *
> + * Copyright (c) 2022 Bytedance
> + * Author: lei he 
> + *
> + * This library is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2.1 of the License, or (at your option) any later version.
> + *
> + * This library is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with this library; if not, see 
> .
> + *
> + */
> +
> +#include 
> +
> +#include 
> +
> +#include "qemu/osdep.h"
> +#include "qemu/host-utils.h"
> +#include "asn1_decoder.h"
> +#include "crypto/akcipher.h"
> +#include "crypto/random.h"
> +#include "qapi/error.h"
> +#include "sysemu/cryptodev.h"
> +#include "rsakey.h"
> +
> +typedef struct QCryptoGcryptRSA {
> +QCryptoAkCipher akcipher;
> +gcry_sexp_t key;
> +QCryptoRSAPaddingAlgorithm padding_alg;
> +QCryptoHashAlgorithm hash_alg;
> +} QCryptoGcryptRSA;
> +
> +static void qcrypto_gcrypt_rsa_destroy(QCryptoGcryptRSA *rsa)
> +{
> +if (!rsa) {
> +return;
> +}
> +
> +gcry_sexp_release(rsa->key);
> +g_free(rsa);
> +}
> +
> +static QCryptoGcryptRSA *qcrypto_gcrypt_rsa_new(
> +const QCryptoAkCipherOptionsRSA *opt,
> +QCryptoAkCipherKeyType type,
> +const uint8_t *key,  size_t keylen,
> +Error **errp);
> +
> +QCryptoAkCipher *qcrypto_akcipher_new(const QCryptoAkCipherOptions *opts,
> +  QCryptoAkCipherKeyType type,
> +  const uint8_t *key, size_t keylen,
> +  Error **errp)
> +{
> +switch (opts->algorithm) {
> +case QCRYPTO_AKCIPHER_ALG_RSA:
> +return (QCryptoAkCipher *)qcrypto_gcrypt_rsa_new(
> +&opts->u.rsa, type, key, keylen, errp);
> +
> +default:
> +error_setg(errp, "Unsupported algorithm: %u", opts->algorithm);
> +return NULL;
> +}
> +
> +return NULL;
> +}
> +
> +static void qcrypto_gcrypt_set_rsa_size(QCryptoAkCipher *akcipher, 
> gcry_mpi_t n)
> +{
> +size_t key_size = (gcry_mpi_get_nbits(n) + 7) / 8;
> +akcipher->max_plaintext_len = key_size;
> +akcipher->max_ciphertext_len = key_size;
> +akcipher->max_dgst_len = key_size;
> +akcipher->max_signature_len = key_size;
> +}
> +
> +static int qcrypto_gcrypt_parse_rsa_private_key(
> +QCryptoGcryptRSA *rsa,
> +const uint8_t *key, size_t keylen)
> +{
> +QCryptoAkCipherRSAKey *rsa_key =
> +qcrypto_akcipher_parse_rsa_private_key(key, keylen);

With my earlier suggestion, you can use

  g_autoptr(QCryptoAkCipherRSAKey) rsa_key = ...

and avoid need for the later  qcrypto_akcipher_free_rsa_key calls.

> +gcry_mpi_t n = NULL, e = NULL, d = NULL, p = NULL, q = NULL, u = NULL;
> +int ret = -1;
> +bool compute_mul_inv = false;
> +gcry_error_t err;
> +if (!rsa_key) {
> +return ret;
> +}
> +
> +err = gcry_mpi_scan(&n, GCRYMPI_FMT_STD,
> +rsa_key->n.data, rsa_key->n.len, NULL);
> +if (gcry_err_code(err) != 0) {
> +goto clear;
> +}
> +
> +err = gcry_mpi_scan(&e, GCRYMPI_FMT_STD,
> +rsa_key->e.data, rsa_key->e.len, NULL);
> +if (gcry_err_code(err) != 0) {
> +goto clear;
> +}
> +
> +err = gcry_mpi_scan(&d, GCRYMPI_FMT_STD,
> +rsa_key->d.data, rsa_key->d.len, NULL);
> +if (gcry_err_code(err) != 0) {
> +goto clear;
> +}
> +
> +err = gcry_mpi_scan(&p, GCRYMPI_FMT_STD,
> +rsa_key->p.data, rsa_key->p.len, NULL);
> +if (gcry_err_code(err) != 0) {
> +goto clear;
> +}
> +
> +err = gcry_mpi_scan(&q, GCRYMPI_FMT_STD,
> +rsa_key->q.data, rsa_key->q.len, NULL);
> +if (gcry_err_code(err) != 0) {
> +goto clear;
> +}
> +
> +if (gcry_mpi_cmp_ui(p, 0) > 0 && gcry_mpi_cmp_ui(q, 0) > 0) {
> + 

Re: [PATCH v4 5/8] crypto: Implement RSA algorithm by hogweed

2022-04-26 Thread Daniel P . Berrangé
On Mon, Apr 11, 2022 at 06:43:24PM +0800, zhenwei pi wrote:
> From: Lei He 
> 
> Implement RSA algorithm by hogweed from nettle. Thus QEMU supports
> a 'real' RSA backend to handle request from guest side. It's
> important to test RSA offload case without OS & hardware requirement.
> 
> Signed-off-by: Lei He 
> Signed-off-by: zhenwei pi 
> ---
>  crypto/akcipher-nettle.c.inc | 448 +++
>  crypto/akcipher.c|   4 +
>  crypto/meson.build   |   4 +
>  crypto/rsakey-builtin.c.inc  | 150 
>  crypto/rsakey-nettle.c.inc   | 141 +++
>  crypto/rsakey.c  |  43 
>  crypto/rsakey.h  |  96 
>  meson.build  |  11 +
>  8 files changed, 897 insertions(+)
>  create mode 100644 crypto/akcipher-nettle.c.inc
>  create mode 100644 crypto/rsakey-builtin.c.inc
>  create mode 100644 crypto/rsakey-nettle.c.inc
>  create mode 100644 crypto/rsakey.c
>  create mode 100644 crypto/rsakey.h
> 
> diff --git a/crypto/akcipher-nettle.c.inc b/crypto/akcipher-nettle.c.inc
> new file mode 100644
> index 00..de163cd89e
> --- /dev/null
> +++ b/crypto/akcipher-nettle.c.inc
> @@ -0,0 +1,448 @@
> +/*
> + * QEMU Crypto akcipher algorithms
> + *
> + * Copyright (c) 2022 Bytedance
> + * Author: lei he 
> + *
> + * This library is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2.1 of the License, or (at your option) any later version.
> + *
> + * This library is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with this library; if not, see 
> .
> + *
> + */
> +
> +#include 
> +
> +#include 
> +
> +#include "qemu/osdep.h"
> +#include "qemu/host-utils.h"
> +#include "asn1_decoder.h"
> +#include "crypto/akcipher.h"
> +#include "crypto/random.h"
> +#include "qapi/error.h"
> +#include "sysemu/cryptodev.h"
> +#include "rsakey.h"
> +
> +typedef struct QCryptoNettleRSA {
> +QCryptoAkCipher akcipher;
> +struct rsa_public_key pub;
> +struct rsa_private_key priv;
> +QCryptoRSAPaddingAlgorithm padding_alg;
> +QCryptoHashAlgorithm hash_alg;
> +} QCryptoNettleRSA;
> +
> +static void qcrypto_nettle_rsa_destroy(void *ptr)
> +{
> +QCryptoNettleRSA *rsa = (QCryptoNettleRSA *)ptr;
> +if (!rsa) {
> +return;
> +}
> +
> +rsa_public_key_clear(&rsa->pub);
> +rsa_private_key_clear(&rsa->priv);
> +g_free(rsa);
> +}
> +
> +static QCryptoAkCipher *qcrypto_nettle_rsa_new(
> +const QCryptoAkCipherOptionsRSA *opt,
> +QCryptoAkCipherKeyType type,
> +const uint8_t *key,  size_t keylen,
> +Error **errp);
> +
> +QCryptoAkCipher *qcrypto_akcipher_new(const QCryptoAkCipherOptions *opts,
> +  QCryptoAkCipherKeyType type,
> +  const uint8_t *key, size_t keylen,
> +  Error **errp)
> +{
> +switch (opts->algorithm) {
> +case QCRYPTO_AKCIPHER_ALG_RSA:
> +return qcrypto_nettle_rsa_new(&opts->u.rsa, type, key, keylen, errp);
> +
> +default:
> +error_setg(errp, "Unsupported algorithm: %u", opts->algorithm);
> +return NULL;
> +}
> +
> +return NULL;
> +}
> +
> +static void qcrypto_nettle_rsa_set_akcipher_size(QCryptoAkCipher *akcipher,
> + int key_size)
> +{
> +akcipher->max_plaintext_len = key_size;
> +akcipher->max_ciphertext_len = key_size;
> +akcipher->max_signature_len = key_size;
> +akcipher->max_dgst_len = key_size;
> +}
> +
> +static int qcrypt_nettle_parse_rsa_private_key(QCryptoNettleRSA *rsa,
> +   const uint8_t *key,
> +   size_t keylen)
> +{
> +QCryptoAkCipherRSAKey *rsa_key =
> +qcrypto_akcipher_parse_rsa_private_key(key, keylen);
> +int ret = -1;
> +if (!rsa_key) {
> +return ret;
> +}
> +
> +nettle_mpz_init_set_str_256_u(rsa->pub.n, rsa_key->n.len, 
> rsa_key->n.data);
> +nettle_mpz_init_set_str_256_u(rsa->pub.e, rsa_key->e.len, 
> rsa_key->e.data);
> +nettle_mpz_init_set_str_256_u(rsa->priv.d, rsa_key->d.len, 
> rsa_key->d.data);
> +nettle_mpz_init_set_str_256_u(rsa->priv.p, rsa_key->p.len, 
> rsa_key->p.data);
> +nettle_mpz_init_set_str_256_u(rsa->priv.q, rsa_key->q.len, 
> rsa_key->q.data);
> +nettle_mpz_init_set_str_256_u(rsa->priv.a, rsa_key->dp.len,
> +  rsa_key->dp.data);
> +nettle_mpz_init_set_str_256_u(rsa->priv.b, rsa_key->dq.len,
> 

Re: [PATCH v4 4/8] crypto: add ASN.1 decoder

2022-04-26 Thread Daniel P . Berrangé
On Mon, Apr 11, 2022 at 06:43:23PM +0800, zhenwei pi wrote:
> From: Lei He 
> 
> Add an ANS.1 decoder which is used to parse asymmetric
> cipher keys
> 
> Signed-off-by: zhenwei pi 
> Signed-off-by: Lei He 
> ---
>  crypto/asn1_decoder.c | 161 ++
>  crypto/asn1_decoder.h |  75 +++
>  crypto/meson.build|   1 +
>  tests/unit/meson.build|   1 +
>  tests/unit/test-crypto-asn1-decoder.c | 289 ++
>  5 files changed, 527 insertions(+)
>  create mode 100644 crypto/asn1_decoder.c
>  create mode 100644 crypto/asn1_decoder.h
>  create mode 100644 tests/unit/test-crypto-asn1-decoder.c
> 
> diff --git a/crypto/asn1_decoder.c b/crypto/asn1_decoder.c
> new file mode 100644
> index 00..506487f713
> --- /dev/null
> +++ b/crypto/asn1_decoder.c

Lets rename this to simply 'der.c' since the DER format is just one
way ASN1 can be encoded, and we only care about DER.

> @@ -0,0 +1,161 @@
> +/*
> + * QEMU Crypto ASN.1 decoder
> + *
> + * Copyright (c) 2022 Bytedance
> + * Author: lei he 
> + *
> + * This library is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2.1 of the License, or (at your option) any later version.
> + *
> + * This library is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with this library; if not, see 
> .
> + *
> + */
> +
> +#include 
> +#include 
> +
> +#include "crypto/asn1_decoder.h"
> +
> +enum der_type_tag {

QCryptoDERTypeTag

> +der_type_tag_bool = 0x1,

QCRYPTO_DER_TYPE_TAG_  for each constant

> +der_type_tag_int = 0x2,
> +der_type_tag_bit_str = 0x3,
> +der_type_tag_oct_str = 0x4,
> +der_type_tag_oct_null = 0x5,
> +der_type_tag_oct_oid = 0x6,
> +der_type_tag_seq = 0x10,
> +der_type_tag_set = 0x11,
> +};
> +
> +#define DER_CONSTRUCTED_MASK 0x20
> +#define DER_SHORT_LEN_MASK 0x80

QCRYPTO_DER_ as the name prefix for constants.

> +
> +static uint8_t der_peek_byte(const uint8_t **data, size_t *dlen)

'qcrypto_der_' as the name prefix for all methods


> +{
> +return **data;
> +}
> +
> +static void der_cut_nbytes(const uint8_t **data, size_t *dlen,
> +   size_t nbytes)
> +{
> +*data += nbytes;
> +*dlen -= nbytes;
> +}
> +
> +static uint8_t der_cut_byte(const uint8_t **data, size_t *dlen)
> +{
> +uint8_t val = der_peek_byte(data, dlen);
> +
> +der_cut_nbytes(data, dlen, 1);
> +
> +return val;
> +}
> +
> +static int der_invoke_callback(DERDecodeCb cb, void *ctx,
> +   const uint8_t *value, size_t vlen)

Make sure the 'const uint8_t' is vertically aligned with
the 'DERDecodeCb'

> +{
> +if (!cb) {
> +return 0;
> +}
> +
> +return cb(ctx, value, vlen);
> +}
> +
> +static int der_extract_definite_data(const uint8_t **data, size_t *dlen,
> + DERDecodeCb cb, void *ctx)
> +{
> +const uint8_t *value;
> +size_t vlen = 0;
> +uint8_t byte_count = der_cut_byte(data, dlen);
> +
> +/* short format of definite-length */
> +if (!(byte_count & DER_SHORT_LEN_MASK)) {
> +if (byte_count > *dlen) {
> +return -1;
> +}
> +
> +value = *data;
> +vlen = byte_count;
> +der_cut_nbytes(data, dlen, vlen);
> +
> +if (der_invoke_callback(cb, ctx, value, vlen)) {
> +return -1;
> +}
> +return vlen;
> +}
> +
> +/* Ignore highest bit */
> +byte_count &= ~DER_SHORT_LEN_MASK;
> +
> +/*
> + * size_t is enough to express the length, although the der encoding
> + * standard supports larger length.
> + */
> +if (byte_count > sizeof(size_t)) {
> +return -1;
> +}
> +
> +while (byte_count--) {
> +vlen <<= 8;
> +vlen += der_cut_byte(data, dlen);
> +}
> +
> +if (vlen > *dlen) {
> +return -1;
> +}
> +
> +value = *data;
> +der_cut_nbytes(data, dlen, vlen);
> +
> +if (der_invoke_callback(cb, ctx, value, vlen) != 0) {
> +return -1;
> +}
> +return vlen;
> +}
> +
> +static int der_extract_data(const uint8_t **data, size_t *dlen,
> +DERDecodeCb cb, void *ctx)
> +{
> +uint8_t val = der_peek_byte(data, dlen);
> +
> +/* must use definite length format */
> +if (val == DER_SHORT_LEN_MASK) {
> +return -1;
> +}
> +
> +return der_extract_definite_data(data, dlen, cb, ctx);
> +}
> +
> +int der_decode_int(const uint8_t **data, size_t *dlen,
> + 

Re: [PATCH v4 3/8] crypto: Introduce akcipher crypto class

2022-04-26 Thread Daniel P . Berrangé
On Mon, Apr 11, 2022 at 06:43:22PM +0800, zhenwei pi wrote:
> Support basic asymmetric operations: encrypt, decrypt, sign and
> verify.
> 
> Co-developed-by: lei he 
> Signed-off-by: lei he 
> Signed-off-by: zhenwei pi 
> ---
>  crypto/akcipher.c | 102 +
>  crypto/akcipherpriv.h |  43 +++
>  crypto/meson.build|   1 +
>  include/crypto/akcipher.h | 151 ++
>  4 files changed, 297 insertions(+)
>  create mode 100644 crypto/akcipher.c
>  create mode 100644 crypto/akcipherpriv.h
>  create mode 100644 include/crypto/akcipher.h


> diff --git a/crypto/akcipherpriv.h b/crypto/akcipherpriv.h
> new file mode 100644
> index 00..da9e54a796
> --- /dev/null
> +++ b/crypto/akcipherpriv.h
> @@ -0,0 +1,43 @@
> +/*
> + * QEMU Crypto asymmetric algorithms
> + *
> + * Copyright (c) 2022 Bytedance
> + * Author: zhenwei pi 
> + *
> + * This library is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2.1 of the License, or (at your option) any later version.
> + *
> + * This library is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with this library; if not, see 
> .
> + *
> + */
> +
> +#ifndef QCRYPTO_AKCIPHERPRIV_H
> +#define QCRYPTO_AKCIPHERPRIV_H
> +
> +#include "qapi/qapi-types-crypto.h"
> +
> +struct QCryptoAkCipherDriver {
> +int (*encrypt)(QCryptoAkCipher *akcipher,
> +   const void *in, size_t in_len,
> +   void *out, size_t out_len, Error **errp);
> +int (*decrypt)(QCryptoAkCipher *akcipher,
> +   const void *out, size_t out_len,
> +   void *in, size_t in_len, Error **errp);
> +int (*sign)(QCryptoAkCipher *akcipher,
> +const void *in, size_t in_len,
> +void *out, size_t out_len, Error **errp);
> +int (*verify)(QCryptoAkCipher *akcipher,
> +  const void *in, size_t in_len,
> +  const void *in2, size_t in2_len, Error **errp);
> +int (*free)(QCryptoAkCipher *akcipher, Error **errp);
> +};
> +
> +#endif /* QCRYPTO_AKCIPHER_H */


> diff --git a/include/crypto/akcipher.h b/include/crypto/akcipher.h
> new file mode 100644
> index 00..c1970b3b3b
> --- /dev/null
> +++ b/include/crypto/akcipher.h
> @@ -0,0 +1,151 @@
> +/*
> + * QEMU Crypto asymmetric algorithms
> + *
> + * Copyright (c) 2022 Bytedance
> + * Author: zhenwei pi 
> + *
> + * This library is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2.1 of the License, or (at your option) any later version.
> + *
> + * This library is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with this library; if not, see 
> .
> + *
> + */
> +
> +#ifndef QCRYPTO_AKCIPHER_H
> +#define QCRYPTO_AKCIPHER_H
> +
> +#include "qapi/qapi-types-crypto.h"
> +
> +typedef struct QCryptoAkCipher QCryptoAkCipher;

This belongs here.

> +typedef struct QCryptoAkCipherDriver QCryptoAkCipherDriver;

This and...

> +
> +struct QCryptoAkCipher {
> +QCryptoAkCipherAlgorithm alg;
> +QCryptoAkCipherKeyType type;
> +int max_plaintext_len;
> +int max_ciphertext_len;
> +int max_signature_len;
> +int max_dgst_len;
> +QCryptoAkCipherDriver *driver;
> +};

...this should be in the akcipherpriv.h file though, since
they're only for internal usage.



> +/**
> + * qcrypto_akcipher_encrypt:
> + * @akcipher: akcipher context
> + * @in: plaintext pending to be encrypted
> + * @in_len: length of the plaintext, MUST less or equal to max_plaintext_len
> + * @out: buffer to store the ciphertext
> + * @out_len: the length of ciphertext buffer, usually equals to
> + *   max_ciphertext_len
> + * @errp: error pointer
> + *
> + * Encrypt data and write ciphertext into out
> + *
> + * Returns: length of ciphertext if encrypt succeed, otherwise -1 is returned
> + */
> +int qcrypto_akcipher_encrypt(QCryptoAkCipher *akcipher,
> + const void *in, size_t in_len,
> + void *out, size_t out_len, Error **errp);
> +
> +/**
> + * qcrypto_akcipher_decrypt:
> + * @akcipher: akcipher context

Re: [PATCH v4 2/8] crypto-akcipher: Introduce akcipher types to qapi

2022-04-26 Thread Daniel P . Berrangé
On Mon, Apr 11, 2022 at 06:43:21PM +0800, zhenwei pi wrote:
> From: Lei He 
> 
> Introduce akcipher types, also include RSA related types.
> 
> Signed-off-by: Lei He 
> Signed-off-by: zhenwei pi 
> ---
>  qapi/crypto.json | 64 
>  1 file changed, 64 insertions(+)

snip

> +##
> +# @QCryptoAkCipherOptions:
> +#
> +# The options that are available for all asymmetric key algorithms
> +# when creating a new QCryptoAkCipher.
> +#
> +# Since: 7.1
> +##
> +{ 'union': 'QCryptoAkCipherOptions',
> +  'base': { 'algorithm': 'QCryptoAkCipherAlgorithm' },
> +  'discriminator': 'algorithm',
> +  'data': { 'rsa': 'QCryptoAkCipherOptionsRSA' }}

I mistakenly suggested 'algorithm' here, but for consistency
with other fields, I should have said just 'alg'.

With that change

  Reviewed-by: Daniel P. Berrangé 


With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH v2] vduse: Fix NULL pointer dereference on sysfs access

2022-04-26 Thread Greg KH
On Tue, Apr 26, 2022 at 05:41:15PM +0800, Yongji Xie wrote:
> On Tue, Apr 26, 2022 at 4:07 PM Greg KH  wrote:
> >
> > On Tue, Apr 26, 2022 at 03:36:56PM +0800, Xie Yongji wrote:
> > > The control device has no drvdata. So we will get a
> > > NULL pointer dereference when accessing control
> > > device's msg_timeout attribute via sysfs:
> > >
> > > [ 132.841881][ T3644] BUG: kernel NULL pointer dereference, address: 
> > > 00f8
> > > [ 132.850619][ T3644] RIP: 0010:msg_timeout_show 
> > > (drivers/vdpa/vdpa_user/vduse_dev.c:1271)
> > > [ 132.869447][ T3644] dev_attr_show (drivers/base/core.c:2094)
> > > [ 132.870215][ T3644] sysfs_kf_seq_show (fs/sysfs/file.c:59)
> > > [ 132.871164][ T3644] ? device_remove_bin_file (drivers/base/core.c:2088)
> > > [ 132.872082][ T3644] kernfs_seq_show (fs/kernfs/file.c:164)
> > > [ 132.872838][ T3644] seq_read_iter (fs/seq_file.c:230)
> > > [ 132.873578][ T3644] ? __vmalloc_area_node (mm/vmalloc.c:3041)
> > > [ 132.874532][ T3644] kernfs_fop_read_iter (fs/kernfs/file.c:238)
> > > [ 132.875513][ T3644] __kernel_read (fs/read_write.c:440 (discriminator 
> > > 1))
> > > [ 132.876319][ T3644] kernel_read (fs/read_write.c:459)
> > > [ 132.877129][ T3644] kernel_read_file (fs/kernel_read_file.c:94)
> > > [ 132.877978][ T3644] kernel_read_file_from_fd (include/linux/file.h:45 
> > > fs/kernel_read_file.c:186)
> > > [ 132.879019][ T3644] __do_sys_finit_module (kernel/module.c:4207)
> > > [ 132.879930][ T3644] __ia32_sys_finit_module (kernel/module.c:4189)
> > > [ 132.880930][ T3644] do_int80_syscall_32 (arch/x86/entry/common.c:112 
> > > arch/x86/entry/common.c:132)
> > > [ 132.881847][ T3644] entry_INT80_compat 
> > > (arch/x86/entry/entry_64_compat.S:419)
> > >
> > > To fix it, don't create the unneeded attribute for
> > > control device anymore.
> > >
> > > Fixes: c8a6153b6c59 ("vduse: Introduce VDUSE - vDPA Device in Userspace")
> > > Reported-by: kernel test robot 
> > > Cc: sta...@vger.kernel.org
> > > Signed-off-by: Xie Yongji 
> > > ---
> > >  drivers/vdpa/vdpa_user/vduse_dev.c | 7 +++
> > >  1 file changed, 3 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c 
> > > b/drivers/vdpa/vdpa_user/vduse_dev.c
> > > index f85d1a08ed87..160e40d03084 100644
> > > --- a/drivers/vdpa/vdpa_user/vduse_dev.c
> > > +++ b/drivers/vdpa/vdpa_user/vduse_dev.c
> > > @@ -1344,9 +1344,9 @@ static int vduse_create_dev(struct vduse_dev_config 
> > > *config,
> > >
> > >   dev->minor = ret;
> > >   dev->msg_timeout = VDUSE_MSG_DEFAULT_TIMEOUT;
> > > - dev->dev = device_create(vduse_class, NULL,
> > > -  MKDEV(MAJOR(vduse_major), dev->minor),
> > > -  dev, "%s", config->name);
> > > + dev->dev = device_create_with_groups(vduse_class, NULL,
> > > + MKDEV(MAJOR(vduse_major), dev->minor),
> > > + dev, vduse_dev_groups, "%s", config->name);
> > >   if (IS_ERR(dev->dev)) {
> > >   ret = PTR_ERR(dev->dev);
> > >   goto err_dev;
> > > @@ -1595,7 +1595,6 @@ static int vduse_init(void)
> > >   return PTR_ERR(vduse_class);
> > >
> > >   vduse_class->devnode = vduse_devnode;
> > > - vduse_class->dev_groups = vduse_dev_groups;
> >
> > Ok, this looks much better.
> >
> > But wow, there are some problems in this code overall.  I see a number
> > of flat-out-wrong things in there that should have been caught by code
> > reviews.  Some examples:
> > - empty release() callbacks.  That is a huge sign the code
> >   design is wrong and broken and you are just trying to make the
> >   driver core quiet for some reason.  The documentation in the
> >   kernel explains why this is not ok.
> 
> Sorry, I failed to find the documentation. Do you mean we should
> remove the empty release() callbacks?

Yes, why are they needed?

(hint, retorical question, you added them to remove the driver core
warning when the device is removed, which means someone added them just
because they thought that their code could ignore the hints that the
driver core was telling them.)

Please properly free the memory here.

> > - __module_get(THIS_MODULE);  That's racy, buggy, and doesn't do
> >   what you think it does.  Please never ever ever do that.  It
> >   too is a sign of a broken design.
> 
> I don't find a good way to remove it. We have to make sure the module
> can't be removed until all vduse devices are destroyed.

That will happen automatically when the module is removed.

> And I think __module_get(THIS_MODULE) should be safe in our case since
> we always call it when we have a reference from open().

What happened if someone removed the module _right before_ this was
called?  You can not grab your own reference count safely.

Please just remove it, it's not needed and is broken.  There should not
be any reason that the module can not be unloaded, UNL

Re: [PATCH v9 00/32] virtio pci support VIRTIO_F_RING_RESET (refactor vring)

2022-04-26 Thread Xuan Zhuo
On Tue, 26 Apr 2022 05:55:41 -0400, "Michael S. Tsirkin"  
wrote:
> On Wed, Apr 06, 2022 at 11:43:14AM +0800, Xuan Zhuo wrote:
> > The virtio spec already supports the virtio queue reset function. This 
> > patch set
> > is to add this function to the kernel. The relevant virtio spec information 
> > is
> > here:
> >
> > https://github.com/oasis-tcs/virtio-spec/issues/124
> >
> > Also regarding MMIO support for queue reset, I plan to support it after this
> > patch is passed.
>
> Regarding the spec, there's now an issue proposing
> some changes to the interface. What do you think about that
> proposal? Could you respond on that thread on the virtio TC mailing list?

I have read that thread. And also asked some questions.  I will follow that
thread.

Grateful.

>
>
> > This patch set implements the refactoring of vring. Finally, the
> > virtuque_resize() interface is provided based on the reset function of the
> > transport layer.
> >
> > Test environment:
> > Host: 4.19.91
> > Qemu: QEMU emulator version 6.2.50 (with vq reset support)
> > Test Cmd:  ethtool -G eth1 rx $1 tx $2; ethtool -g eth1
> >
> > The default is split mode, modify Qemu virtio-net to add PACKED feature 
> > to test
> > packed mode.
> >
> > Qemu code:
> > 
> > https://github.com/fengidri/qemu/compare/89f3bfa3265554d1d591ee4d7f1197b6e3397e84...master
> >
> > In order to simplify the review of this patch set, the function of reusing
> > the old buffers after resize will be introduced in subsequent patch sets.
> >
> > Please review. Thanks.
> >
> > v9:
> >   1. Provide a virtqueue_resize() interface directly
> >   2. A patch set including vring resize, virtio pci reset, virtio-net resize
> >   3. No more separate structs
> >
> > v8:
> >   1. Provide a virtqueue_reset() interface directly
> >   2. Split the two patch sets, this is the first part
> >   3. Add independent allocation helper for allocating state, extra
> >
> > v7:
> >   1. fix #6 subject typo
> >   2. fix #6 ring_size_in_bytes is uninitialized
> >   3. check by: make W=12
> >
> > v6:
> >   1. virtio_pci: use synchronize_irq(irq) to sync the irq callbacks
> >   2. Introduce virtqueue_reset_vring() to implement the reset of vring 
> > during
> >  the reset process. May use the old vring if num of the vq not change.
> >   3. find_vqs() support sizes to special the max size of each vq
> >
> > v5:
> >   1. add virtio-net support set_ringparam
> >
> > v4:
> >   1. just the code of virtio, without virtio-net
> >   2. Performing reset on a queue is divided into these steps:
> > 1. reset_vq: reset one vq
> > 2. recycle the buffer from vq by virtqueue_detach_unused_buf()
> > 3. release the ring of the vq by vring_release_virtqueue()
> > 4. enable_reset_vq: re-enable the reset queue
> >   3. Simplify the parameters of enable_reset_vq()
> >   4. add container structures for virtio_pci_common_cfg
> >
> > v3:
> >   1. keep vq, irq unreleased
> >
> > Xuan Zhuo (32):
> >   virtio: add helper virtqueue_get_vring_max_size()
> >   virtio: struct virtio_config_ops add callbacks for queue_reset
> >   virtio_ring: update the document of the virtqueue_detach_unused_buf
> > for queue reset
> >   virtio_ring: remove the arg vq of vring_alloc_desc_extra()
> >   virtio_ring: extract the logic of freeing vring
> >   virtio_ring: split: extract the logic of alloc queue
> >   virtio_ring: split: extract the logic of alloc state and extra
> >   virtio_ring: split: extract the logic of attach vring
> >   virtio_ring: split: extract the logic of vq init
> >   virtio_ring: split: introduce virtqueue_reinit_split()
> >   virtio_ring: split: introduce virtqueue_resize_split()
> >   virtio_ring: packed: extract the logic of alloc queue
> >   virtio_ring: packed: extract the logic of alloc state and extra
> >   virtio_ring: packed: extract the logic of attach vring
> >   virtio_ring: packed: extract the logic of vq init
> >   virtio_ring: packed: introduce virtqueue_reinit_packed()
> >   virtio_ring: packed: introduce virtqueue_resize_packed()
> >   virtio_ring: introduce virtqueue_resize()
> >   virtio_pci: struct virtio_pci_common_cfg add queue_notify_data
> >   virtio: queue_reset: add VIRTIO_F_RING_RESET
> >   virtio_pci: queue_reset: update struct virtio_pci_common_cfg and
> > option functions
> >   virtio_pci: queue_reset: extract the logic of active vq for modern pci
> >   virtio_pci: queue_reset: support VIRTIO_F_RING_RESET
> >   virtio: find_vqs() add arg sizes
> >   virtio_pci: support the arg sizes of find_vqs()
> >   virtio_mmio: support the arg sizes of find_vqs()
> >   virtio: add helper virtio_find_vqs_ctx_size()
> >   virtio_net: set the default max ring size by find_vqs()
> >   virtio_net: get ringparam by virtqueue_get_vring_max_size()
> >   virtio_net: split free_unused_bufs()
> >   virtio_net: support rx/tx queue resize
> >   virtio_net: support set_ringparam
> >
> >  arch/um/drivers/virtio_uml.c |   3 +-
> >  drivers/net/virtio_

Re: [PATCH v9 00/32] virtio pci support VIRTIO_F_RING_RESET (refactor vring)

2022-04-26 Thread Michael S. Tsirkin
On Wed, Apr 06, 2022 at 11:43:14AM +0800, Xuan Zhuo wrote:
> The virtio spec already supports the virtio queue reset function. This patch 
> set
> is to add this function to the kernel. The relevant virtio spec information is
> here:
> 
> https://github.com/oasis-tcs/virtio-spec/issues/124
> 
> Also regarding MMIO support for queue reset, I plan to support it after this
> patch is passed.

Regarding the spec, there's now an issue proposing
some changes to the interface. What do you think about that
proposal? Could you respond on that thread on the virtio TC mailing list?


> This patch set implements the refactoring of vring. Finally, the
> virtuque_resize() interface is provided based on the reset function of the
> transport layer.
> 
> Test environment:
> Host: 4.19.91
> Qemu: QEMU emulator version 6.2.50 (with vq reset support)
> Test Cmd:  ethtool -G eth1 rx $1 tx $2; ethtool -g eth1
> 
> The default is split mode, modify Qemu virtio-net to add PACKED feature 
> to test
> packed mode.
> 
> Qemu code:
> 
> https://github.com/fengidri/qemu/compare/89f3bfa3265554d1d591ee4d7f1197b6e3397e84...master
> 
> In order to simplify the review of this patch set, the function of reusing
> the old buffers after resize will be introduced in subsequent patch sets.
> 
> Please review. Thanks.
> 
> v9:
>   1. Provide a virtqueue_resize() interface directly
>   2. A patch set including vring resize, virtio pci reset, virtio-net resize
>   3. No more separate structs
> 
> v8:
>   1. Provide a virtqueue_reset() interface directly
>   2. Split the two patch sets, this is the first part
>   3. Add independent allocation helper for allocating state, extra
> 
> v7:
>   1. fix #6 subject typo
>   2. fix #6 ring_size_in_bytes is uninitialized
>   3. check by: make W=12
> 
> v6:
>   1. virtio_pci: use synchronize_irq(irq) to sync the irq callbacks
>   2. Introduce virtqueue_reset_vring() to implement the reset of vring during
>  the reset process. May use the old vring if num of the vq not change.
>   3. find_vqs() support sizes to special the max size of each vq
> 
> v5:
>   1. add virtio-net support set_ringparam
> 
> v4:
>   1. just the code of virtio, without virtio-net
>   2. Performing reset on a queue is divided into these steps:
> 1. reset_vq: reset one vq
> 2. recycle the buffer from vq by virtqueue_detach_unused_buf()
> 3. release the ring of the vq by vring_release_virtqueue()
> 4. enable_reset_vq: re-enable the reset queue
>   3. Simplify the parameters of enable_reset_vq()
>   4. add container structures for virtio_pci_common_cfg
> 
> v3:
>   1. keep vq, irq unreleased
> 
> Xuan Zhuo (32):
>   virtio: add helper virtqueue_get_vring_max_size()
>   virtio: struct virtio_config_ops add callbacks for queue_reset
>   virtio_ring: update the document of the virtqueue_detach_unused_buf
> for queue reset
>   virtio_ring: remove the arg vq of vring_alloc_desc_extra()
>   virtio_ring: extract the logic of freeing vring
>   virtio_ring: split: extract the logic of alloc queue
>   virtio_ring: split: extract the logic of alloc state and extra
>   virtio_ring: split: extract the logic of attach vring
>   virtio_ring: split: extract the logic of vq init
>   virtio_ring: split: introduce virtqueue_reinit_split()
>   virtio_ring: split: introduce virtqueue_resize_split()
>   virtio_ring: packed: extract the logic of alloc queue
>   virtio_ring: packed: extract the logic of alloc state and extra
>   virtio_ring: packed: extract the logic of attach vring
>   virtio_ring: packed: extract the logic of vq init
>   virtio_ring: packed: introduce virtqueue_reinit_packed()
>   virtio_ring: packed: introduce virtqueue_resize_packed()
>   virtio_ring: introduce virtqueue_resize()
>   virtio_pci: struct virtio_pci_common_cfg add queue_notify_data
>   virtio: queue_reset: add VIRTIO_F_RING_RESET
>   virtio_pci: queue_reset: update struct virtio_pci_common_cfg and
> option functions
>   virtio_pci: queue_reset: extract the logic of active vq for modern pci
>   virtio_pci: queue_reset: support VIRTIO_F_RING_RESET
>   virtio: find_vqs() add arg sizes
>   virtio_pci: support the arg sizes of find_vqs()
>   virtio_mmio: support the arg sizes of find_vqs()
>   virtio: add helper virtio_find_vqs_ctx_size()
>   virtio_net: set the default max ring size by find_vqs()
>   virtio_net: get ringparam by virtqueue_get_vring_max_size()
>   virtio_net: split free_unused_bufs()
>   virtio_net: support rx/tx queue resize
>   virtio_net: support set_ringparam
> 
>  arch/um/drivers/virtio_uml.c |   3 +-
>  drivers/net/virtio_net.c | 219 +++-
>  drivers/platform/mellanox/mlxbf-tmfifo.c |   3 +
>  drivers/remoteproc/remoteproc_virtio.c   |   3 +
>  drivers/s390/virtio/virtio_ccw.c |   4 +
>  drivers/virtio/virtio_mmio.c |  11 +-
>  drivers/virtio/virtio_pci_common.c   |  28 +-
>  drivers/virtio/virtio_pci_common.h   |   3 +-
>  drivers/virtio/virtio_pci

Re: [PATCH v2] vduse: Fix NULL pointer dereference on sysfs access

2022-04-26 Thread Greg KH
On Tue, Apr 26, 2022 at 03:36:56PM +0800, Xie Yongji wrote:
> The control device has no drvdata. So we will get a
> NULL pointer dereference when accessing control
> device's msg_timeout attribute via sysfs:
> 
> [ 132.841881][ T3644] BUG: kernel NULL pointer dereference, address: 
> 00f8
> [ 132.850619][ T3644] RIP: 0010:msg_timeout_show 
> (drivers/vdpa/vdpa_user/vduse_dev.c:1271)
> [ 132.869447][ T3644] dev_attr_show (drivers/base/core.c:2094)
> [ 132.870215][ T3644] sysfs_kf_seq_show (fs/sysfs/file.c:59)
> [ 132.871164][ T3644] ? device_remove_bin_file (drivers/base/core.c:2088)
> [ 132.872082][ T3644] kernfs_seq_show (fs/kernfs/file.c:164)
> [ 132.872838][ T3644] seq_read_iter (fs/seq_file.c:230)
> [ 132.873578][ T3644] ? __vmalloc_area_node (mm/vmalloc.c:3041)
> [ 132.874532][ T3644] kernfs_fop_read_iter (fs/kernfs/file.c:238)
> [ 132.875513][ T3644] __kernel_read (fs/read_write.c:440 (discriminator 1))
> [ 132.876319][ T3644] kernel_read (fs/read_write.c:459)
> [ 132.877129][ T3644] kernel_read_file (fs/kernel_read_file.c:94)
> [ 132.877978][ T3644] kernel_read_file_from_fd (include/linux/file.h:45 
> fs/kernel_read_file.c:186)
> [ 132.879019][ T3644] __do_sys_finit_module (kernel/module.c:4207)
> [ 132.879930][ T3644] __ia32_sys_finit_module (kernel/module.c:4189)
> [ 132.880930][ T3644] do_int80_syscall_32 (arch/x86/entry/common.c:112 
> arch/x86/entry/common.c:132)
> [ 132.881847][ T3644] entry_INT80_compat 
> (arch/x86/entry/entry_64_compat.S:419)
> 
> To fix it, don't create the unneeded attribute for
> control device anymore.
> 
> Fixes: c8a6153b6c59 ("vduse: Introduce VDUSE - vDPA Device in Userspace")
> Reported-by: kernel test robot 
> Cc: sta...@vger.kernel.org
> Signed-off-by: Xie Yongji 
> ---
>  drivers/vdpa/vdpa_user/vduse_dev.c | 7 +++
>  1 file changed, 3 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c 
> b/drivers/vdpa/vdpa_user/vduse_dev.c
> index f85d1a08ed87..160e40d03084 100644
> --- a/drivers/vdpa/vdpa_user/vduse_dev.c
> +++ b/drivers/vdpa/vdpa_user/vduse_dev.c
> @@ -1344,9 +1344,9 @@ static int vduse_create_dev(struct vduse_dev_config 
> *config,
>  
>   dev->minor = ret;
>   dev->msg_timeout = VDUSE_MSG_DEFAULT_TIMEOUT;
> - dev->dev = device_create(vduse_class, NULL,
> -  MKDEV(MAJOR(vduse_major), dev->minor),
> -  dev, "%s", config->name);
> + dev->dev = device_create_with_groups(vduse_class, NULL,
> + MKDEV(MAJOR(vduse_major), dev->minor),
> + dev, vduse_dev_groups, "%s", config->name);
>   if (IS_ERR(dev->dev)) {
>   ret = PTR_ERR(dev->dev);
>   goto err_dev;
> @@ -1595,7 +1595,6 @@ static int vduse_init(void)
>   return PTR_ERR(vduse_class);
>  
>   vduse_class->devnode = vduse_devnode;
> - vduse_class->dev_groups = vduse_dev_groups;

Ok, this looks much better.

But wow, there are some problems in this code overall.  I see a number
of flat-out-wrong things in there that should have been caught by code
reviews.  Some examples:
- empty release() callbacks.  That is a huge sign the code
  design is wrong and broken and you are just trying to make the
  driver core quiet for some reason.  The documentation in the
  kernel explains why this is not ok.
- __module_get(THIS_MODULE);  That's racy, buggy, and doesn't do
  what you think it does.  Please never ever ever do that.  It
  too is a sign of a broken design.
- no Documentation/ABI/ entries for the sysfs files here.  I
  think it's burried in some other documentation file but that's
  not the correct place for it and if you run scripts/get_abi.pl
  with the code loaded it will rightly complain about this.

Do you want to address these, or do you want patches for them?

thanks,

greg k-h
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization