Re: [PATCH V3 6/9] virtio-ccw: implement synchronize_cbs()

2022-04-27 Thread Jason Wang
On Wed, Apr 27, 2022 at 2:30 PM Michael S. Tsirkin  wrote:
>
> On Wed, Apr 27, 2022 at 11:53:25AM +0800, Jason Wang wrote:
> > On Tue, Apr 26, 2022 at 2:30 PM Michael S. Tsirkin  wrote:
> > >
> > > On Tue, Apr 26, 2022 at 12:07:39PM +0800, Jason Wang wrote:
> > > > On Tue, Apr 26, 2022 at 11:55 AM Michael S. Tsirkin  
> > > > wrote:
> > > > >
> > > > > On Mon, Apr 25, 2022 at 11:53:24PM -0400, Michael S. Tsirkin wrote:
> > > > > > On Tue, Apr 26, 2022 at 11:42:45AM +0800, Jason Wang wrote:
> > > > > > >
> > > > > > > 在 2022/4/26 11:38, Michael S. Tsirkin 写道:
> > > > > > > > On Mon, Apr 25, 2022 at 11:35:41PM -0400, Michael S. Tsirkin 
> > > > > > > > wrote:
> > > > > > > > > On Tue, Apr 26, 2022 at 04:29:11AM +0200, Halil Pasic wrote:
> > > > > > > > > > On Mon, 25 Apr 2022 09:59:55 -0400
> > > > > > > > > > "Michael S. Tsirkin"  wrote:
> > > > > > > > > >
> > > > > > > > > > > On Mon, Apr 25, 2022 at 10:54:24AM +0200, Cornelia Huck 
> > > > > > > > > > > wrote:
> > > > > > > > > > > > On Mon, Apr 25 2022, "Michael S. Tsirkin" 
> > > > > > > > > > > >  wrote:
> > > > > > > > > > > > > On Mon, Apr 25, 2022 at 10:44:15AM +0800, Jason Wang 
> > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > This patch tries to implement the synchronize_cbs() 
> > > > > > > > > > > > > > for ccw. For the
> > > > > > > > > > > > > > vring_interrupt() that is called via 
> > > > > > > > > > > > > > virtio_airq_handler(), the
> > > > > > > > > > > > > > synchronization is simply done via the airq_info's 
> > > > > > > > > > > > > > lock. For the
> > > > > > > > > > > > > > vring_interrupt() that is called via 
> > > > > > > > > > > > > > virtio_ccw_int_handler(), a per
> > > > > > > > > > > > > > device spinlock for irq is introduced and used in 
> > > > > > > > > > > > > > the synchronization
> > > > > > > > > > > > > > method.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Cc: Thomas Gleixner 
> > > > > > > > > > > > > > Cc: Peter Zijlstra 
> > > > > > > > > > > > > > Cc: "Paul E. McKenney" 
> > > > > > > > > > > > > > Cc: Marc Zyngier 
> > > > > > > > > > > > > > Cc: Halil Pasic 
> > > > > > > > > > > > > > Cc: Cornelia Huck 
> > > > > > > > > > > > > > Signed-off-by: Jason Wang 
> > > > > > > > > > > > >
> > > > > > > > > > > > > This is the only one that is giving me pause. Halil, 
> > > > > > > > > > > > > Cornelia,
> > > > > > > > > > > > > should we be concerned about the performance impact 
> > > > > > > > > > > > > here?
> > > > > > > > > > > > > Any chance it can be tested?
> > > > > > > > > > > > We can have a bunch of devices using the same airq 
> > > > > > > > > > > > structure, and the
> > > > > > > > > > > > sync cb creates a choke point, same as 
> > > > > > > > > > > > registering/unregistering.
> > > > > > > > > > > BTW can callbacks for multiple VQs run on multiple CPUs 
> > > > > > > > > > > at the moment?
> > > > > > > > > > I'm not sure I understand the question.
> > > > > > > > > >
> > > > > > > > > > I do think we can have multiple CPUs that are executing 
> > > > > > > > > > some portion of
> > > > > > > > > > virtio_ccw_int_handler(). So I guess the answer is yes. 
> > > > > > > > > > Connie what do you think?
> > > > > > > > > >
> > > > > > > > > > On the other hand we could also end up serializing 
> > > > > > > > > > synchronize_cbs()
> > > > > > > > > > calls for different devices if they happen to use the same 
> > > > > > > > > > airq_info. But
> > > > > > > > > > this probably was not your question
> > > > > > > > >
> > > > > > > > > I am less concerned about  synchronize_cbs being slow and 
> > > > > > > > > more about
> > > > > > > > > the slowdown in interrupt processing itself.
> > > > > > > > >
> > > > > > > > > > > this patch serializes them on a spinlock.
> > > > > > > > > > >
> > > > > > > > > > Those could then pile up on the newly introduced spinlock.
> > > > > > > > > >
> > > > > > > > > > Regards,
> > > > > > > > > > Halil
> > > > > > > > > Hmm yea ... not good.
> > > > > > > > Is there any other way to synchronize with all callbacks?
> > > > > > >
> > > > > > >
> > > > > > > Maybe using rwlock as airq handler?
> > > > > > >
> > > > > > > Thanks
> > > > > > >
> > > > > >
> > > > > > rwlock is still a shared cacheline bouncing between CPUs and
> > > > > > a bunch of ordering instructions.
> > > >
> > > > Yes, but it should be faster than spinlocks anyhow.
> > > >
> > > > > > Maybe something per-cpu + some IPIs to run things on all CPUs 
> > > > > > instead?
> > > >
> > > > Is this something like a customized version of 
> > > > synchronize_rcu_expedited()?
> > >
> > > With interrupts running in an RCU read-side critical section?
> >
> > For vring_interrupt(), yes.
> >
> >
> > > Quite possibly that is also an option.
> > > This will need a bunch of documentation since this is not
> > > a standard use of RCU,
> >
> > According to Documentation/RCU/requirements.rst, it looks like a legal case:
> >
> > "
> > The Linux kernel has interrupts, and RCU read-side critical sections are
> > legal within interrupt handlers and within interrupt-disabled regions of
> > code, as are invocations of call_rcu().
> > "
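
A minimal sketch of the RCU-based idea being discussed, assuming
vring_interrupt() always runs in hard-irq context (which RCU treats as a
read-side critical section); the function name and placement below are
illustrative assumptions only, not the posted patch:

#include <linux/kernel.h>
#include <linux/rcupdate.h>
#include <linux/virtio.h>

/*
 * Sketch: wait for all in-flight vring_interrupt() invocations to finish.
 * Hard interrupt handlers are RCU read-side critical sections, so one
 * grace period guarantees that every handler which started before this
 * call has completed -- no spinlock is taken in the interrupt hot path.
 */
static void virtio_ccw_synchronize_cbs(struct virtio_device *vdev)
{
	might_sleep();		/* must be called from process context */
	synchronize_rcu();
}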

Re: [PATCH V3 6/9] virtio-ccw: implement synchronize_cbs()

2022-04-27 Thread Cornelia Huck
On Tue, Apr 26 2022, "Michael S. Tsirkin"  wrote:

> On Tue, Apr 26, 2022 at 05:47:17PM +0200, Cornelia Huck wrote:
>> On Mon, Apr 25 2022, "Michael S. Tsirkin"  wrote:
>> 
>> > On Mon, Apr 25, 2022 at 11:53:24PM -0400, Michael S. Tsirkin wrote:
>> >> On Tue, Apr 26, 2022 at 11:42:45AM +0800, Jason Wang wrote:
>> >> > 
>> >> > 在 2022/4/26 11:38, Michael S. Tsirkin 写道:
>> >> > > On Mon, Apr 25, 2022 at 11:35:41PM -0400, Michael S. Tsirkin wrote:
>> >> > > > On Tue, Apr 26, 2022 at 04:29:11AM +0200, Halil Pasic wrote:
>> >> > > > > On Mon, 25 Apr 2022 09:59:55 -0400
>> >> > > > > "Michael S. Tsirkin"  wrote:
>> >> > > > > 
>> >> > > > > > On Mon, Apr 25, 2022 at 10:54:24AM +0200, Cornelia Huck wrote:
>> >> > > > > > > On Mon, Apr 25 2022, "Michael S. Tsirkin"  
>> >> > > > > > > wrote:
>> >> > > > > > > > On Mon, Apr 25, 2022 at 10:44:15AM +0800, Jason Wang wrote:
>> >> > > > > > > > > This patch tries to implement the synchronize_cbs() for 
>> >> > > > > > > > > ccw. For the
>> >> > > > > > > > > vring_interrupt() that is called via 
>> >> > > > > > > > > virtio_airq_handler(), the
>> >> > > > > > > > > synchronization is simply done via the airq_info's lock. 
>> >> > > > > > > > > For the
>> >> > > > > > > > > vring_interrupt() that is called via 
>> >> > > > > > > > > virtio_ccw_int_handler(), a per
>> >> > > > > > > > > device spinlock for irq is introduced and used in the 
>> >> > > > > > > > > synchronization
>> >> > > > > > > > > method.
>> >> > > > > > > > > 
>> >> > > > > > > > > Cc: Thomas Gleixner 
>> >> > > > > > > > > Cc: Peter Zijlstra 
>> >> > > > > > > > > Cc: "Paul E. McKenney" 
>> >> > > > > > > > > Cc: Marc Zyngier 
>> >> > > > > > > > > Cc: Halil Pasic 
>> >> > > > > > > > > Cc: Cornelia Huck 
>> >> > > > > > > > > Signed-off-by: Jason Wang 
>> >> > > > > > > > 
>> >> > > > > > > > This is the only one that is giving me pause. Halil, 
>> >> > > > > > > > Cornelia,
>> >> > > > > > > > should we be concerned about the performance impact here?
>> >> > > > > > > > Any chance it can be tested?
>> >> > > > > > > We can have a bunch of devices using the same airq structure, 
>> >> > > > > > > and the
>> >> > > > > > > sync cb creates a choke point, same as 
>> >> > > > > > > registering/unregistering.
>> >> > > > > > BTW can callbacks for multiple VQs run on multiple CPUs at the 
>> >> > > > > > moment?
>> >> > > > > I'm not sure I understand the question.
>> >> > > > > 
>> >> > > > > I do think we can have multiple CPUs that are executing some 
>> >> > > > > portion of
>> >> > > > > virtio_ccw_int_handler(). So I guess the answer is yes. Connie 
>> >> > > > > what do you think?
>> >> > > > > 
>> >> > > > > On the other hand we could also end up serializing 
>> >> > > > > synchronize_cbs()
>> >> > > > > calls for different devices if they happen to use the same 
>> >> > > > > airq_info. But
>> >> > > > > this probably was not your question
>> >> > > > 
>> >> > > > I am less concerned about  synchronize_cbs being slow and more about
>> >> > > > the slowdown in interrupt processing itself.
>> >> > > > 
>> >> > > > > > this patch serializes them on a spinlock.
>> >> > > > > > 
>> >> > > > > Those could then pile up on the newly introduced spinlock.
>> 
>> How bad would that be in practice? IIUC, we hit on the spinlock when
>> - doing synchronize_cbs (should be rare)
>> - processing queue interrupts for devices using per-device indicators
>>   (which is the non-preferred path, which I would basically only expect
>>   when running on an ancient or non-standard hypervisor)
>
> this one is my concern. I am worried serializing everything on a single lock
> will drastically regress performance here.

Yeah, that case could get much worse. OTOH, how likely is it that any
setup that runs a recent kernel will actually end up with devices using
per-device indicators? Anything running under a QEMU released in the
last couple of years is unlikely to not use airqs, I think. Halil, do
you think that the classic indicator setup would be more common on any
non-QEMU hypervisors?

IOW, how much effort is it worth spending on optimizing this case? We
certainly should explore any simple solutions, but I don't think we need
to twist ourselves into pretzels to solve it.

>
>
>> - configuration change interrupts (should be rare)
>> - during setup, reset, etc. (should not be a concern)


Re: [PATCH 2/2] virtio: replace arch_has_restricted_virtio_memory_access()

2022-04-27 Thread Juergen Gross via Virtualization

On 27.04.22 14:28, Borislav Petkov wrote:

On Wed, Apr 27, 2022 at 08:37:31AM +0200, Juergen Gross wrote:

On 26.04.22 19:35, Borislav Petkov wrote:

On Tue, Apr 26, 2022 at 03:40:21PM +0200, Juergen Gross wrote:

	/* protected virtualization */
	static void pv_init(void)
	{
		if (!is_prot_virt_guest())
			return;
+		platform_set_feature(PLATFORM_VIRTIO_RESTRICTED_MEM_ACCESS);


Kinda long-ish for my taste. I'll probably call it:

platform_set()

as it is implicit that it sets a feature bit.


Okay, fine with me.




diff --git a/arch/x86/mm/mem_encrypt_identity.c 
b/arch/x86/mm/mem_encrypt_identity.c
index b43bc24d2bb6..6043ba6cd17d 100644
--- a/arch/x86/mm/mem_encrypt_identity.c
+++ b/arch/x86/mm/mem_encrypt_identity.c
@@ -40,6 +40,7 @@
   #include 
   #include 
   #include 
+#include 
   #include 
   #include 
@@ -566,6 +567,10 @@ void __init sme_enable(struct boot_params *bp)
} else {
/* SEV state cannot be controlled by a command line option */
sme_me_mask = me_mask;
+
+   /* Set restricted memory access for virtio. */
+   platform_set_feature(PLATFORM_VIRTIO_RESTRICTED_MEM_ACCESS);


Huh, what does that have to do with SME?


I picked the function where sev_status is being set, as this seemed to be
the correct place to set the feature bit.


What I don't understand is what does restricted memory access have to do
with AMD SEV and how does it play together with what you guys are trying to
do?

The big picture pls.


Ah, okay.

For support of virtio with Xen we want to not only support the virtio
devices like KVM does, but also use grants for letting the guest decide
which pages are allowed to be mapped by the backend (dom0).

Instead of physical guest addresses the guest will use grant-ids (plus
offset). In order to be able to handle this at the basic virtio level
instead of in the individual virtio device drivers, we need to use
dedicated dma-ops. And those will be used by virtio only if the
"restricted virtio memory access" flag is set, which is used by SEV, too.
In order to let virtio set this flag, we need a way to communicate to
virtio that the running system is either a SEV guest or a Xen guest.

HTH,


Juergen
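
A rough sketch of that addressing scheme, with invented names (the exact
encoding is up to the Xen grant dma-ops implementation): the "DMA address"
handed to the device encodes a grant reference plus the offset into the
granted page, with a high bit marking it as grant-based rather than
guest-physical:

#include <linux/mm.h>
#include <linux/types.h>

/* Assumption: bit 63 flags an address as grant-encoded. */
#define GRANT_DMA_ADDR_OFF	(1ULL << 63)

static inline dma_addr_t grant_to_dma(u32 grant_ref, unsigned long offset)
{
	return GRANT_DMA_ADDR_OFF |
	       ((dma_addr_t)grant_ref << PAGE_SHIFT) |
	       (offset & ~PAGE_MASK);	/* keep only the in-page offset */
}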



Re: [PATCH 2/2] virtio: replace arch_has_restricted_virtio_memory_access()

2022-04-27 Thread Juergen Gross via Virtualization

On 27.04.22 14:26, Borislav Petkov wrote:

On Wed, Apr 27, 2022 at 08:40:08AM +0200, Juergen Gross wrote:

I was planning to look at the x86 cpu features to see whether some of
those might be candidates to be switched to platform features instead.


I'd say "never touch a running system" unless the platform features are
of an advantage...


Depends on the use case IMHO.

All features that are based on a cpuid bit are not candidates. The same
applies to all features used for alternatives handling (assuming we don't
want to expand that to platform features).

I really have no idea whether this will leave any candidates. In case
it does, it might be interesting to switch those in order to save some
per-cpu bits.

Other candidates might be the x86_legacy_features.


Juergen



Re: [PATCH v2 2/5] hv_sock: Copy packets sent by Hyper-V out of the ring buffer

2022-04-27 Thread Stefano Garzarella

On Wed, Apr 27, 2022 at 03:12:22PM +0200, Andrea Parri (Microsoft) wrote:

Pointers to VMbus packets sent by Hyper-V are used by the hv_sock driver
within the guest VM.  Hyper-V can send packets with erroneous values or
modify packet fields after they are processed by the guest.  To defend
against these scenarios, copy the incoming packet after validating its
length and offset fields using hv_pkt_iter_{first,next}().  Use
HVS_PKT_LEN(HVS_MTU_SIZE) to initialize the buffer which holds the
copies of the incoming packets.  In this way, the packet can no longer
be modified by the host.

Signed-off-by: Andrea Parri (Microsoft) 
Reviewed-by: Michael Kelley 
---
net/vmw_vsock/hyperv_transport.c | 9 +++--
1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/net/vmw_vsock/hyperv_transport.c b/net/vmw_vsock/hyperv_transport.c
index 943352530936e..8c37d07017fc4 100644
--- a/net/vmw_vsock/hyperv_transport.c
+++ b/net/vmw_vsock/hyperv_transport.c
@@ -78,6 +78,9 @@ struct hvs_send_buf {
 ALIGN((payload_len), 8) + \
 VMBUS_PKT_TRAILER_SIZE)

+/* Upper bound on the size of a VMbus packet for hv_sock */
+#define HVS_MAX_PKT_SIZE   HVS_PKT_LEN(HVS_MTU_SIZE)
+
union hvs_service_id {
guid_t  srv_id;

@@ -378,6 +381,8 @@ static void hvs_open_connection(struct vmbus_channel *chan)
rcvbuf = ALIGN(rcvbuf, HV_HYP_PAGE_SIZE);
}

+   chan->max_pkt_size = HVS_MAX_PKT_SIZE;
+
ret = vmbus_open(chan, sndbuf, rcvbuf, NULL, 0, hvs_channel_cb,
 conn_from_host ? new : sk);
if (ret != 0) {
@@ -602,7 +607,7 @@ static ssize_t hvs_stream_dequeue(struct vsock_sock *vsk, 
struct msghdr *msg,
return -EOPNOTSUPP;

if (need_refill) {
-   hvs->recv_desc = hv_pkt_iter_first_raw(hvs->chan);
+   hvs->recv_desc = hv_pkt_iter_first(hvs->chan);
if (!hvs->recv_desc)
return -ENOBUFS;
ret = hvs_update_recv_data(hvs);
@@ -618,7 +623,7 @@ static ssize_t hvs_stream_dequeue(struct vsock_sock *vsk, 
struct msghdr *msg,

hvs->recv_data_len -= to_read;
if (hvs->recv_data_len == 0) {
-   hvs->recv_desc = hv_pkt_iter_next_raw(hvs->chan, 
hvs->recv_desc);
+   hvs->recv_desc = hv_pkt_iter_next(hvs->chan, hvs->recv_desc);
if (hvs->recv_desc) {
ret = hvs_update_recv_data(hvs);
if (ret)
--
2.25.1



Reviewed-by: Stefano Garzarella 



Re: [PATCH v2 3/5] hv_sock: Add validation for untrusted Hyper-V values

2022-04-27 Thread Stefano Garzarella

On Wed, Apr 27, 2022 at 03:12:23PM +0200, Andrea Parri (Microsoft) wrote:

For additional robustness in the face of Hyper-V errors or malicious
behavior, validate all values that originate from packets that Hyper-V
has sent to the guest in the host-to-guest ring buffer.  Ensure that
invalid values cannot cause data to be copied out of the bounds of the
source buffer in hvs_stream_dequeue().

Signed-off-by: Andrea Parri (Microsoft) 
Reviewed-by: Michael Kelley 
---
include/linux/hyperv.h   |  5 +
net/vmw_vsock/hyperv_transport.c | 10 --
2 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
index fe2e0179ed51e..55478a6810b60 100644
--- a/include/linux/hyperv.h
+++ b/include/linux/hyperv.h
@@ -1663,6 +1663,11 @@ static inline u32 hv_pkt_datalen(const struct 
vmpacket_descriptor *desc)
return (desc->len8 << 3) - (desc->offset8 << 3);
}

+/* Get packet length associated with descriptor */
+static inline u32 hv_pkt_len(const struct vmpacket_descriptor *desc)
+{
+   return desc->len8 << 3;
+}

struct vmpacket_descriptor *
hv_pkt_iter_first_raw(struct vmbus_channel *channel);
diff --git a/net/vmw_vsock/hyperv_transport.c b/net/vmw_vsock/hyperv_transport.c
index 8c37d07017fc4..fd98229e3db30 100644
--- a/net/vmw_vsock/hyperv_transport.c
+++ b/net/vmw_vsock/hyperv_transport.c
@@ -577,12 +577,18 @@ static bool hvs_dgram_allow(u32 cid, u32 port)
static int hvs_update_recv_data(struct hvsock *hvs)
{
struct hvs_recv_buf *recv_buf;
-   u32 payload_len;
+   u32 pkt_len, payload_len;
+
+   pkt_len = hv_pkt_len(hvs->recv_desc);
+
+   if (pkt_len < HVS_HEADER_LEN)
+   return -EIO;

recv_buf = (struct hvs_recv_buf *)(hvs->recv_desc + 1);
payload_len = recv_buf->hdr.data_size;

-   if (payload_len > HVS_MTU_SIZE)
+   if (payload_len > pkt_len - HVS_HEADER_LEN ||
+   payload_len > HVS_MTU_SIZE)
return -EIO;

if (payload_len == 0)
--
2.25.1



Reviewed-by: Stefano Garzarella 
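
A hedged, self-contained illustration (user-space C, not driver code) of
why hvs_update_recv_data() must check pkt_len against HVS_HEADER_LEN before
using pkt_len - HVS_HEADER_LEN as an upper bound: both values are unsigned,
so the subtraction would wrap for a short packet. The header length below
is a stand-in, not the real HVS_HEADER_LEN:

#include <stdint.h>
#include <stdio.h>

#define FAKE_HVS_HEADER_LEN 24u	/* stand-in for the real header size */

int main(void)
{
	uint32_t pkt_len = 8;	/* hostile descriptor shorter than a header */
	uint32_t bound = pkt_len - FAKE_HVS_HEADER_LEN;	/* wraps near 2^32 */

	/* A payload_len <= bound check now passes for almost any value. */
	printf("bound = %u\n", bound);
	return 0;
}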



Re: [PATCH 2/2] virtio: replace arch_has_restricted_virtio_memory_access()

2022-04-27 Thread Tom Lendacky via Virtualization

On 4/27/22 07:37, Juergen Gross wrote:

On 27.04.22 14:28, Borislav Petkov wrote:

On Wed, Apr 27, 2022 at 08:37:31AM +0200, Juergen Gross wrote:

On 26.04.22 19:35, Borislav Petkov wrote:

On Tue, Apr 26, 2022 at 03:40:21PM +0200, Juergen Gross wrote:

   /* protected virtualization */
   static void pv_init(void)
   {
   if (!is_prot_virt_guest())
   return;
+    platform_set_feature(PLATFORM_VIRTIO_RESTRICTED_MEM_ACCESS);


Kinda long-ish for my taste. I'll probably call it:

platform_set()

as it is implicit that it sets a feature bit.


Okay, fine with me.



diff --git a/arch/x86/mm/mem_encrypt_identity.c 
b/arch/x86/mm/mem_encrypt_identity.c

index b43bc24d2bb6..6043ba6cd17d 100644
--- a/arch/x86/mm/mem_encrypt_identity.c
+++ b/arch/x86/mm/mem_encrypt_identity.c
@@ -40,6 +40,7 @@
   #include 
   #include 
   #include 
+#include 
   #include 
   #include 
@@ -566,6 +567,10 @@ void __init sme_enable(struct boot_params *bp)
   } else {
   /* SEV state cannot be controlled by a command line option */
   sme_me_mask = me_mask;
+
+    /* Set restricted memory access for virtio. */
+    platform_set_feature(PLATFORM_VIRTIO_RESTRICTED_MEM_ACCESS);


This is way early in the boot, but it appears that marking the platform 
feature bitmap as __read_mostly puts this in the .data section, so it avoids 
the issue of bss being cleared.


TDX support also uses the arch_has_restricted_virtio_memory_access() 
function and will need to be updated.


Seems like a lot of changes, I just wonder if the arch_has...() 
function couldn't be updated to also include a Xen check?


Thanks,
Tom



Huh, what does that have to do with SME?


I picked the function where sev_status is being set, as this seemed to be
the correct place to set the feature bit.


What I don't understand is what does restricted memory access have to do
with AMD SEV and how does it play together with what you guys are trying to
do?

The big picture pls.


Ah, okay.

For support of virtio with Xen we want to not only support the virtio
devices like KVM does, but also use grants for letting the guest decide
which pages are allowed to be mapped by the backend (dom0).

Instead of physical guest addresses the guest will use grant-ids (plus
offset). In order to be able to handle this at the basic virtio level
instead of in the individual virtio device drivers, we need to use
dedicated dma-ops. And those will be used by virtio only if the
"restricted virtio memory access" flag is set, which is used by SEV, too.
In order to let virtio set this flag, we need a way to communicate to
virtio that the running system is either a SEV guest or a Xen guest.

HTH,


Juergen


Re: [PATCH] drm/bochs: Explicitly include linux/module.h

2022-04-27 Thread Daniel Vetter
On Wed, Apr 13, 2022 at 06:12:59PM +0200, Michel Dänzer wrote:
> From: Michel Dänzer 
> 
> Instead of relying on it getting pulled in indirectly.
> 
> Signed-off-by: Michel Dänzer 

Reviewed-by: Daniel Vetter 

> ---
>  drivers/gpu/drm/tiny/bochs.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/gpu/drm/tiny/bochs.c b/drivers/gpu/drm/tiny/bochs.c
> index ed971c8bb446..4f8bf86633df 100644
> --- a/drivers/gpu/drm/tiny/bochs.c
> +++ b/drivers/gpu/drm/tiny/bochs.c
> @@ -1,5 +1,6 @@
>  // SPDX-License-Identifier: GPL-2.0-or-later
>  
> +#include 
>  #include 
>  
>  #include 
> -- 
> 2.35.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH 2/2] virtio: replace arch_has_restricted_virtio_memory_access()

2022-04-27 Thread Juergen Gross via Virtualization

On 27.04.22 16:09, Tom Lendacky wrote:

On 4/27/22 07:37, Juergen Gross wrote:

On 27.04.22 14:28, Borislav Petkov wrote:

On Wed, Apr 27, 2022 at 08:37:31AM +0200, Juergen Gross wrote:

On 26.04.22 19:35, Borislav Petkov wrote:

On Tue, Apr 26, 2022 at 03:40:21PM +0200, Juergen Gross wrote:

   /* protected virtualization */
   static void pv_init(void)
   {
   if (!is_prot_virt_guest())
   return;
+    platform_set_feature(PLATFORM_VIRTIO_RESTRICTED_MEM_ACCESS);


Kinda long-ish for my taste. I'll probably call it:

platform_set()

as it is implicit that it sets a feature bit.


Okay, fine with me.



diff --git a/arch/x86/mm/mem_encrypt_identity.c 
b/arch/x86/mm/mem_encrypt_identity.c

index b43bc24d2bb6..6043ba6cd17d 100644
--- a/arch/x86/mm/mem_encrypt_identity.c
+++ b/arch/x86/mm/mem_encrypt_identity.c
@@ -40,6 +40,7 @@
   #include 
   #include 
   #include 
+#include 
   #include 
   #include 
@@ -566,6 +567,10 @@ void __init sme_enable(struct boot_params *bp)
   } else {
   /* SEV state cannot be controlled by a command line option */
   sme_me_mask = me_mask;
+
+    /* Set restricted memory access for virtio. */
+    platform_set_feature(PLATFORM_VIRTIO_RESTRICTED_MEM_ACCESS);


This is way early in the boot, but it appears that marking the platform feature 
bitmap as __read_mostly puts this in the .data section, so it avoids the issue of 
bss being cleared.


In V2 (not yet posted) I have moved the call to sev_setup_arch().



TDX support also uses the arch_has_restricted_virtio_memory_access() function 
and will need to be updated.


Yes.

Seems like a lot of changes, I just wonder if the arch_has...() function 
couldn't be updated to also include a Xen check?


This was not seen to be a nice solution.

And TBH, I think this series is making the code much cleaner. Look at the
diffstat of this patch.


Juergen



Re: [PATCH v4 10/15] drm/shmem-helper: Take reservation lock instead of drm_gem_shmem locks

2022-04-27 Thread Daniel Vetter
On Mon, Apr 18, 2022 at 10:18:54PM +0300, Dmitry Osipenko wrote:
> Hello,
> 
> On 4/18/22 21:38, Thomas Zimmermann wrote:
> > Hi
> > 
> > Am 18.04.22 um 00:37 schrieb Dmitry Osipenko:
> >> Replace drm_gem_shmem locks with the reservation lock to make GEM
> >> lockings more consistent.
> >>
> >> Previously drm_gem_shmem_vmap() and drm_gem_shmem_get_pages() were
> >> protected by separate locks, now it's the same lock, but it doesn't
> >> make any difference for the current GEM SHMEM users. Only Panfrost
> >> and Lima drivers use vmap() and they do it in the slow code paths,
> >> hence there was no practical justification for the usage of separate
> >> lock in the vmap().
> >>
> >> Suggested-by: Daniel Vetter 
> >> Signed-off-by: Dmitry Osipenko 
> >> ---
> ...
> >>   @@ -310,7 +306,7 @@ static int drm_gem_shmem_vmap_locked(struct
> >> drm_gem_shmem_object *shmem,
> >>   } else {
> >>   pgprot_t prot = PAGE_KERNEL;
> >>   -    ret = drm_gem_shmem_get_pages(shmem);
> >> +    ret = drm_gem_shmem_get_pages_locked(shmem);
> >>   if (ret)
> >>   goto err_zero_use;
> >>   @@ -360,11 +356,11 @@ int drm_gem_shmem_vmap(struct
> >> drm_gem_shmem_object *shmem,
> >>   {
> >>   int ret;
> >>   -    ret = mutex_lock_interruptible(&shmem->vmap_lock);
> >> +    ret = dma_resv_lock_interruptible(shmem->base.resv, NULL);
> >>   if (ret)
> >>   return ret;
> >>   ret = drm_gem_shmem_vmap_locked(shmem, map);
> > 
> > Within drm_gem_shmem_vmap_locked(), there's a call to dma_buf_vmap() for
> > imported pages. If the exporter side also holds/acquires the same
> > reservation lock as our object, the whole thing can deadlock. We cannot
> > move dma_buf_vmap() out of the CS, because we still need to increment
> > the reference counter. I honestly don't know how to easily fix this
> > problem. There's a TODO item about replacing these locks at [1]. As
> > Daniel suggested this patch, we should talk to him about the issue.
> > 
> > Best regards
> > Thomas
> > 
> > [1]
> > https://www.kernel.org/doc/html/latest/gpu/todo.html#move-buffer-object-locking-to-dma-resv-lock
> 
> Indeed, good catch! Perhaps we could simply use a separate lock for the
> vmapping of the *imported* GEMs? The vmap_use_count is used only by
> vmap/vunmap, so it doesn't matter which lock is used by these functions
> in the case of imported GEMs since we only need to protect the
> vmap_use_count.

Apologies for the late reply, I'm flooded.

I discussed this with Daniel Stone last week in a chat, roughly what we
need to do is:

1. Pick a function from shmem helpers.

2. Go through all drivers that call this, and make sure that we acquire
dma_resv_lock in the top level driver entry point for this.

3. Once all driver code paths are converted, add a dma_resv_assert_held()
call to that function to make sure you got it all correct (steps 2 and 3
are sketched below).

4. Repeat 1-3 until all shmem helper functions are converted over.

5. Ditch the 3 different shmem helper locks.
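
A hedged sketch of steps 2 and 3, with invented driver-side names; only the
dma_resv_* calls and the drm_gem_object resv field are real API:

#include <linux/dma-resv.h>
#include <drm/drm_gem.h>

/* Step 3: the helper no longer locks, it only asserts the caller did. */
static int drm_gem_shmem_do_thing_locked(struct drm_gem_object *obj)
{
	dma_resv_assert_held(obj->resv);
	/* ... actual helper work ... */
	return 0;
}

/* Step 2: the top-level driver entry point acquires the reservation lock. */
static int driver_do_thing(struct drm_gem_object *obj)
{
	int ret;

	ret = dma_resv_lock_interruptible(obj->resv, NULL);
	if (ret)
		return ret;

	ret = drm_gem_shmem_do_thing_locked(obj);
	dma_resv_unlock(obj->resv);
	return ret;
}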

The trouble is that I forgot that vmap is a thing, so that needs more
work. I think there's two approaches here:
- Do the vmap at import time. This is the trick we used to untangle the
  dma_resv_lock issues around dma_buf_attachment_map()
- Change the dma_buf_vmap rules that callers must hold the dma_resv_lock.
- Maybe also do what you suggest and keep a separate lock for this, but
  the fundamental issue is that this doesn't really work - if you share
  buffers both ways with two drivers using shmem helpers, then the
  ordering of this vmap_count_mutex vs dma_resv_lock is inconsistent and
  you can get some nice deadlocks. So not a great approach (and also the
  reason why we really need to get everyone to move towards dma_resv_lock
  as _the_ buffer object lock, since otherwise we'll never get a
  consistent lock nesting hierarchy).

The trouble here is that trying to be clever and doing the conversion just
in shmem helpers won't work, because there's a lot of cases where the
drivers are all kinds of inconsistent with their locking.

Adding Daniel S, also maybe for questions it'd be fastest to chat on irc?
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH v4 11/15] drm/shmem-helper: Add generic memory shrinker

2022-04-27 Thread Daniel Vetter
On Tue, Apr 19, 2022 at 11:40:41PM +0300, Dmitry Osipenko wrote:
> On 4/19/22 10:22, Thomas Zimmermann wrote:
> > Hi
> > 
> > Am 18.04.22 um 00:37 schrieb Dmitry Osipenko:
> >> Introduce a common DRM SHMEM shrinker. It allows reducing code
> >> duplication among DRM drivers that implement their own shrinkers.
> >> This is the initial version of the shrinker that covers the basic
> >> needs of GPU drivers; both purging and eviction of shmem objects are
> >> supported.
> >>
> >> This patch is based on a couple of ideas borrowed from Rob Clark's
> >> MSM shrinker and Thomas Zimmermann's variant of the SHMEM shrinker.
> >>
> >> In order to start using DRM SHMEM shrinker drivers should:
> >>
> >> 1. Implement new purge(), evict() + swap_in() GEM callbacks.
> >> 2. Register shrinker using drm_gem_shmem_shrinker_register(drm_device).
> >> 3. Use drm_gem_shmem_set_purgeable_and_evictable(shmem) and alike API
> >>     functions to activate shrinking of GEMs.
> >>
> >> Signed-off-by: Daniel Almeida 
> >> Signed-off-by: Dmitry Osipenko 
> >> ---
> >>   drivers/gpu/drm/drm_gem_shmem_helper.c | 765 -
> >>   include/drm/drm_device.h   |   4 +
> >>   include/drm/drm_gem.h  |  35 ++
> >>   include/drm/drm_gem_shmem_helper.h | 105 +++-
> >>   4 files changed, 877 insertions(+), 32 deletions(-)
> ...
> >> @@ -172,6 +172,41 @@ struct drm_gem_object_funcs {
> >>    * This is optional but necessary for mmap support.
> >>    */
> >>   const struct vm_operations_struct *vm_ops;
> >> +
> >> +    /**
> >> + * @purge:
> >> + *
> >> + * Releases the GEM object's allocated backing storage to the
> >> system.
> >> + *
> >> + * Returns the number of pages that have been freed by purging
> >> the GEM object.
> >> + *
> >> + * This callback is used by the GEM shrinker.
> >> + */
> >> +    unsigned long (*purge)(struct drm_gem_object *obj);

Hm I feel like drivers shouldn't need to know the difference here?

Like shmem helpers can track what's purgeable, and for eviction/purging
the driver callback should do the same?

The only difference is when we try to re-reserve the backing storage. When
the object has been evicted that should succeed, but when the object is
purged that will fail.

That's the difference between evict and purge for drivers?

> >> +
> >> +    /**
> >> + * @evict:
> >> + *
> >> + * Unpins the GEM object's allocated backing storage, allowing
> >> shmem pages
> >> + * to be swapped out.
> > 
> > What's the difference to the existing unpin() callback?
> 
> Drivers need to do more than just unpinning pages when GEMs are evicted.
> Unpinning is only a part of the eviction process. I'll improve the
> doc-comment in v5.
> 
> For example, for the VirtIO-GPU driver we need to detach the host from the
> guest's memory before pages are evicted [1].
> 
> [1]
> https://gitlab.collabora.com/dmitry.osipenko/linux-kernel-rd/-/blob/932eb03198bce3a21353b09ab71e95f1c19b84c2/drivers/gpu/drm/virtio/virtgpu_object.c#L145
> 
> In case of Panfrost driver, we will need to remove mappings before pages
> are evicted.

It might be good to align this with ttm, otoh that all works quite a bit
differently for ttm since ttm supports buffer moves and a lot more fancy
stuff.

I'm bringing this up since I have this fancy idea that eventually we could
glue shmem helpers into ttm in some cases for managing buffers when they
sit in system memory (as opposed to vram).

> >> + *
> >> + * Returns the number of pages that have been unpinned.
> >> + *
> >> + * This callback is used by the GEM shrinker.
> >> + */
> >> +    unsigned long (*evict)(struct drm_gem_object *obj);
> >> +
> >> +    /**
> >> + * @swap_in:
> >> + *
> >> + * Pins GEM object's allocated backing storage if it was
> >> previously evicted,
> >> + * moving swapped out pages back to memory.
> >> + *
> >> + * Returns 0 on success, or -errno on error.
> >> + *
> >> + * This callback is used by the GEM shrinker.
> >> + */
> >> +    int (*swap_in)(struct drm_gem_object *obj);
> > 
> > Why do you need swap_in()? This can be done on-demand as part of a pin
> > or vmap operation.
> 
> Similarly to the unpinning, the pinning of pages is only a part of what
> needs to be done for GPU drivers. Besides returning pages back to
> memory, we also need to make them accessible to the GPU, and this is a
> driver-specific process. This is why we need the additional callbacks.

This is a bit much midlayer. The way this works in ttm is you reserve all
the objects you need (which makes sure they're physically available
again), and then the driver goes through and makes sure the page tables
are all set up again.

Once you get towards gpu vm that's really the only approach, since your
swap_in has no idea for which vm it needs to restore pagetables (and
restoring it for all is a bit meh).

If drivers want to optimize this they can adjust/set any tracking
information from their evict callback.

Re: [PATCH V3 6/9] virtio-ccw: implement synchronize_cbs()

2022-04-27 Thread Michael S. Tsirkin
On Wed, Apr 27, 2022 at 03:57:57PM +0800, Jason Wang wrote:
> On Wed, Apr 27, 2022 at 2:30 PM Michael S. Tsirkin  wrote:
> >
> > On Wed, Apr 27, 2022 at 11:53:25AM +0800, Jason Wang wrote:
> > > On Tue, Apr 26, 2022 at 2:30 PM Michael S. Tsirkin  
> > > wrote:
> > > >
> > > > On Tue, Apr 26, 2022 at 12:07:39PM +0800, Jason Wang wrote:
> > > > > On Tue, Apr 26, 2022 at 11:55 AM Michael S. Tsirkin  
> > > > > wrote:
> > > > > >
> > > > > > On Mon, Apr 25, 2022 at 11:53:24PM -0400, Michael S. Tsirkin wrote:
> > > > > > > On Tue, Apr 26, 2022 at 11:42:45AM +0800, Jason Wang wrote:
> > > > > > > >
> > > > > > > > 在 2022/4/26 11:38, Michael S. Tsirkin 写道:
> > > > > > > > > On Mon, Apr 25, 2022 at 11:35:41PM -0400, Michael S. Tsirkin 
> > > > > > > > > wrote:
> > > > > > > > > > On Tue, Apr 26, 2022 at 04:29:11AM +0200, Halil Pasic wrote:
> > > > > > > > > > > On Mon, 25 Apr 2022 09:59:55 -0400
> > > > > > > > > > > "Michael S. Tsirkin"  wrote:
> > > > > > > > > > >
> > > > > > > > > > > > On Mon, Apr 25, 2022 at 10:54:24AM +0200, Cornelia Huck 
> > > > > > > > > > > > wrote:
> > > > > > > > > > > > > On Mon, Apr 25 2022, "Michael S. Tsirkin" 
> > > > > > > > > > > > >  wrote:
> > > > > > > > > > > > > > On Mon, Apr 25, 2022 at 10:44:15AM +0800, Jason 
> > > > > > > > > > > > > > Wang wrote:
> > > > > > > > > > > > > > > This patch tries to implement the 
> > > > > > > > > > > > > > > synchronize_cbs() for ccw. For the
> > > > > > > > > > > > > > > vring_interrupt() that is called via 
> > > > > > > > > > > > > > > virtio_airq_handler(), the
> > > > > > > > > > > > > > > synchronization is simply done via the 
> > > > > > > > > > > > > > > airq_info's lock. For the
> > > > > > > > > > > > > > > vring_interrupt() that is called via 
> > > > > > > > > > > > > > > virtio_ccw_int_handler(), a per
> > > > > > > > > > > > > > > device spinlock for irq is introduced and used in 
> > > > > > > > > > > > > > > the synchronization
> > > > > > > > > > > > > > > method.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Cc: Thomas Gleixner 
> > > > > > > > > > > > > > > Cc: Peter Zijlstra 
> > > > > > > > > > > > > > > Cc: "Paul E. McKenney" 
> > > > > > > > > > > > > > > Cc: Marc Zyngier 
> > > > > > > > > > > > > > > Cc: Halil Pasic 
> > > > > > > > > > > > > > > Cc: Cornelia Huck 
> > > > > > > > > > > > > > > Signed-off-by: Jason Wang 
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > This is the only one that is giving me pause. 
> > > > > > > > > > > > > > Halil, Cornelia,
> > > > > > > > > > > > > > should we be concerned about the performance impact 
> > > > > > > > > > > > > > here?
> > > > > > > > > > > > > > Any chance it can be tested?
> > > > > > > > > > > > > We can have a bunch of devices using the same airq 
> > > > > > > > > > > > > structure, and the
> > > > > > > > > > > > > sync cb creates a choke point, same as 
> > > > > > > > > > > > > registering/unregistering.
> > > > > > > > > > > > BTW can callbacks for multiple VQs run on multiple CPUs 
> > > > > > > > > > > > at the moment?
> > > > > > > > > > > I'm not sure I understand the question.
> > > > > > > > > > >
> > > > > > > > > > > I do think we can have multiple CPUs that are executing 
> > > > > > > > > > > some portion of
> > > > > > > > > > > virtio_ccw_int_handler(). So I guess the answer is yes. 
> > > > > > > > > > > Connie what do you think?
> > > > > > > > > > >
> > > > > > > > > > > On the other hand we could also end up serializing 
> > > > > > > > > > > synchronize_cbs()
> > > > > > > > > > > calls for different devices if they happen to use the 
> > > > > > > > > > > same airq_info. But
> > > > > > > > > > > this probably was not your question
> > > > > > > > > >
> > > > > > > > > > I am less concerned about  synchronize_cbs being slow and 
> > > > > > > > > > more about
> > > > > > > > > > the slowdown in interrupt processing itself.
> > > > > > > > > >
> > > > > > > > > > > > this patch serializes them on a spinlock.
> > > > > > > > > > > >
> > > > > > > > > > > Those could then pile up on the newly introduced spinlock.
> > > > > > > > > > >
> > > > > > > > > > > Regards,
> > > > > > > > > > > Halil
> > > > > > > > > > Hmm yea ... not good.
> > > > > > > > > Is there any other way to synchronize with all callbacks?
> > > > > > > >
> > > > > > > >
> > > > > > > > Maybe using rwlock as airq handler?
> > > > > > > >
> > > > > > > > Thanks
> > > > > > > >
> > > > > > >
> > > > > > > rwlock is still a shared cacheline bouncing between CPUs and
> > > > > > > a bunch of ordering instructions.
> > > > >
> > > > > Yes, but it should be faster than spinlocks anyhow.
> > > > >
> > > > > > > Maybe something per-cpu + some IPIs to run things on all CPUs 
> > > > > > > instead?
> > > > >
> > > > > Is this something like a customized version of 
> > > > > synchronize_rcu_expedited()?
> > > >
> > > > With interrupts running in an RCU read-side critical section?
> > >
> > > For vring_interrupt(), yes.
> > >
> > >
> > > > Quite possibly that is also an option.

[PATCH v2 2/2] virtio: replace arch_has_restricted_virtio_memory_access()

2022-04-27 Thread Juergen Gross via Virtualization
Instead of using arch_has_restricted_virtio_memory_access() together
with CONFIG_ARCH_HAS_RESTRICTED_VIRTIO_MEMORY_ACCESS, replace those
with platform_has() and a new platform feature
PLATFORM_VIRTIO_RESTRICTED_MEM_ACCESS.

Signed-off-by: Juergen Gross 
---
V2:
- move setting of PLATFORM_VIRTIO_RESTRICTED_MEM_ACCESS in SEV case
  to sev_setup_arch().
---
 arch/s390/Kconfig|  1 -
 arch/s390/mm/init.c  | 13 +++--
 arch/x86/Kconfig |  1 -
 arch/x86/kernel/cpu/mshyperv.c   |  5 -
 arch/x86/mm/mem_encrypt.c|  6 --
 arch/x86/mm/mem_encrypt_amd.c|  4 
 drivers/virtio/Kconfig   |  6 --
 drivers/virtio/virtio.c  |  5 ++---
 include/linux/platform-feature.h |  3 ++-
 include/linux/virtio_config.h|  9 -
 10 files changed, 15 insertions(+), 38 deletions(-)

diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
index e084c72104f8..f97a22ae69a8 100644
--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -772,7 +772,6 @@ menu "Virtualization"
 config PROTECTED_VIRTUALIZATION_GUEST
def_bool n
prompt "Protected virtualization guest support"
-   select ARCH_HAS_RESTRICTED_VIRTIO_MEMORY_ACCESS
help
  Select this option, if you want to be able to run this
  kernel as a protected virtualization KVM guest.
diff --git a/arch/s390/mm/init.c b/arch/s390/mm/init.c
index 86ffd0d51fd5..2c3b451813ed 100644
--- a/arch/s390/mm/init.c
+++ b/arch/s390/mm/init.c
@@ -31,6 +31,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -168,22 +169,14 @@ bool force_dma_unencrypted(struct device *dev)
return is_prot_virt_guest();
 }
 
-#ifdef CONFIG_ARCH_HAS_RESTRICTED_VIRTIO_MEMORY_ACCESS
-
-int arch_has_restricted_virtio_memory_access(void)
-{
-   return is_prot_virt_guest();
-}
-EXPORT_SYMBOL(arch_has_restricted_virtio_memory_access);
-
-#endif
-
 /* protected virtualization */
 static void pv_init(void)
 {
if (!is_prot_virt_guest())
return;
 
+   platform_set(PLATFORM_VIRTIO_RESTRICTED_MEM_ACCESS);
+
/* make sure bounce buffers are shared */
swiotlb_force = SWIOTLB_FORCE;
swiotlb_init(1);
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index b0142e01002e..20ac72546ae4 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1515,7 +1515,6 @@ config X86_CPA_STATISTICS
 config X86_MEM_ENCRYPT
select ARCH_HAS_FORCE_DMA_UNENCRYPTED
select DYNAMIC_PHYSICAL_MASK
-   select ARCH_HAS_RESTRICTED_VIRTIO_MEMORY_ACCESS
def_bool n
 
 config AMD_MEM_ENCRYPT
diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
index 4b67094215bb..965518b9d14b 100644
--- a/arch/x86/kernel/cpu/mshyperv.c
+++ b/arch/x86/kernel/cpu/mshyperv.c
@@ -19,6 +19,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -347,8 +348,10 @@ static void __init ms_hyperv_init_platform(void)
 #endif
 	/* Isolation VMs are unenlightened SEV-based VMs, thus this check: */
 	if (IS_ENABLED(CONFIG_AMD_MEM_ENCRYPT)) {
-		if (hv_get_isolation_type() != HV_ISOLATION_TYPE_NONE)
+		if (hv_get_isolation_type() != HV_ISOLATION_TYPE_NONE) {
 			cc_set_vendor(CC_VENDOR_HYPERV);
+			platform_set(PLATFORM_VIRTIO_RESTRICTED_MEM_ACCESS);
+		}
 	}
 }
 
diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
index 50d209939c66..9b6a7c98b2b1 100644
--- a/arch/x86/mm/mem_encrypt.c
+++ b/arch/x86/mm/mem_encrypt.c
@@ -76,9 +76,3 @@ void __init mem_encrypt_init(void)
 
print_mem_encrypt_feature_info();
 }
-
-int arch_has_restricted_virtio_memory_access(void)
-{
-   return cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT);
-}
-EXPORT_SYMBOL_GPL(arch_has_restricted_virtio_memory_access);
diff --git a/arch/x86/mm/mem_encrypt_amd.c b/arch/x86/mm/mem_encrypt_amd.c
index 6169053c2854..39b71084d36b 100644
--- a/arch/x86/mm/mem_encrypt_amd.c
+++ b/arch/x86/mm/mem_encrypt_amd.c
@@ -21,6 +21,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -206,6 +207,9 @@ void __init sev_setup_arch(void)
size = total_mem * 6 / 100;
size = clamp_val(size, IO_TLB_DEFAULT_SIZE, SZ_1G);
swiotlb_adjust_size(size);
+
+   /* Set restricted memory access for virtio. */
+   platform_set(PLATFORM_VIRTIO_RESTRICTED_MEM_ACCESS);
 }
 
 static unsigned long pg_level_to_pfn(int level, pte_t *kpte, pgprot_t 
*ret_prot)
diff --git a/drivers/virtio/Kconfig b/drivers/virtio/Kconfig
index b5adf6abd241..a6dc8b5846fe 100644
--- a/drivers/virtio/Kconfig
+++ b/drivers/virtio/Kconfig
@@ -6,12 +6,6 @@ config VIRTIO
  bus, such as CONFIG_VIRTIO_PCI, CONFIG_VIRTIO_MMIO, CONFIG_RPMSG
  or CONFIG_S390_GUEST.
 
-config ARCH_HAS_RESTRICTED_VIRTIO_MEMORY_ACCESS
-   bool
-   help
- This o
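
The digest cuts the patch off here. For the consumer side, a sketch of what
the virtio core check that previously called
arch_has_restricted_virtio_memory_access() presumably ends up looking like;
the function name below is illustrative, not the verbatim hunk:

#include <linux/platform-feature.h>
#include <linux/virtio.h>
#include <linux/virtio_config.h>

static int virtio_check_mem_acc(struct virtio_device *dev)
{
	if (!platform_has(PLATFORM_VIRTIO_RESTRICTED_MEM_ACCESS))
		return 0;

	/* Restricted memory access forces the device through the DMA API. */
	if (!virtio_has_feature(dev, VIRTIO_F_VERSION_1) ||
	    !virtio_has_feature(dev, VIRTIO_F_ACCESS_PLATFORM)) {
		dev_warn(&dev->dev,
			 "device must provide VIRTIO_F_VERSION_1 and VIRTIO_F_ACCESS_PLATFORM\n");
		return -ENODEV;
	}

	return 0;
}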

[PATCH v2 0/2] kernel: add new infrastructure for platform_has() support

2022-04-27 Thread Juergen Gross via Virtualization
In another patch series [1] the need has come up to have support for
a generic feature flag infrastructure.

This patch series is introducing that infrastructure and adds the first
use case.

I have decided to use a similar interface to the already known x86
cpu_has() function. As the new infrastructure is meant to be usable for
general and arch-specific feature flags, the flags are being spread
between a general bitmap and an arch-specific one.

The bitmaps all start out as zero; single features can be set or cleared
at any time by using the related platform_set() and platform_clear()
functions.

The platform_has() function is using a simple test_bit() call for now,
further optimization might be added when needed.

[1]: https://lore.kernel.org/lkml/1650646263-22047-1-git-send-email-olekst...@gmail.com/T/#t

Juergen Gross (2):
  kernel: add platform_has() infrastructure
  virtio: replace arch_has_restricted_virtio_memory_access()

 MAINTAINERS|  8 
 arch/s390/Kconfig  |  1 -
 arch/s390/mm/init.c| 13 +++--
 arch/x86/Kconfig   |  1 -
 arch/x86/kernel/cpu/mshyperv.c |  5 -
 arch/x86/mm/mem_encrypt.c  |  6 --
 arch/x86/mm/mem_encrypt_amd.c  |  4 
 drivers/virtio/Kconfig |  6 --
 drivers/virtio/virtio.c|  5 ++---
 include/asm-generic/Kbuild |  1 +
 include/asm-generic/platform-feature.h |  8 
 include/linux/platform-feature.h   | 16 +++
 include/linux/virtio_config.h  |  9 -
 kernel/Makefile|  2 +-
 kernel/platform-feature.c  | 27 ++
 15 files changed, 74 insertions(+), 38 deletions(-)
 create mode 100644 include/asm-generic/platform-feature.h
 create mode 100644 include/linux/platform-feature.h
 create mode 100644 kernel/platform-feature.c

-- 
2.34.1



[PATCH v2 1/2] kernel: add platform_has() infrastructure

2022-04-27 Thread Juergen Gross via Virtualization
Add a simple infrastructure for setting, resetting and querying
platform feature flags.

Flags can be either global or architecture specific.

Signed-off-by: Juergen Gross 
---
V2:
- rename set/reset functions to platform_[set|clear]() (Boris Petkov,
  Heiko Carstens)
- move function implementations to c file (Boris Petkov)
---
 MAINTAINERS|  8 
 include/asm-generic/Kbuild |  1 +
 include/asm-generic/platform-feature.h |  8 
 include/linux/platform-feature.h   | 15 ++
 kernel/Makefile|  2 +-
 kernel/platform-feature.c  | 27 ++
 6 files changed, 60 insertions(+), 1 deletion(-)
 create mode 100644 include/asm-generic/platform-feature.h
 create mode 100644 include/linux/platform-feature.h
 create mode 100644 kernel/platform-feature.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 5e8c2f611766..eb943f089eda 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -15650,6 +15650,14 @@ S: Maintained
 F: Documentation/devicetree/bindings/iio/chemical/plantower,pms7003.yaml
 F: drivers/iio/chemical/pms7003.c
 
+PLATFORM FEATURE INFRASTRUCTURE
+M: Juergen Gross 
+S: Maintained
+F: arch/*/include/asm/platform-feature.h
+F: include/asm-generic/platform-feature.h
+F: include/linux/platform-feature.h
+F: kernel/platform-feature.c
+
 PLDMFW LIBRARY
 M: Jacob Keller 
 S: Maintained
diff --git a/include/asm-generic/Kbuild b/include/asm-generic/Kbuild
index 302506bbc2a4..8e47d483b524 100644
--- a/include/asm-generic/Kbuild
+++ b/include/asm-generic/Kbuild
@@ -44,6 +44,7 @@ mandatory-y += msi.h
 mandatory-y += pci.h
 mandatory-y += percpu.h
 mandatory-y += pgalloc.h
+mandatory-y += platform-feature.h
 mandatory-y += preempt.h
 mandatory-y += rwonce.h
 mandatory-y += sections.h
diff --git a/include/asm-generic/platform-feature.h 
b/include/asm-generic/platform-feature.h
new file mode 100644
index ..4b0af3d51588
--- /dev/null
+++ b/include/asm-generic/platform-feature.h
@@ -0,0 +1,8 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_GENERIC_PLATFORM_FEATURE_H
+#define _ASM_GENERIC_PLATFORM_FEATURE_H
+
+/* Number of arch specific feature flags. */
+#define PLATFORM_ARCH_FEAT_N   0
+
+#endif /* _ASM_GENERIC_PLATFORM_FEATURE_H */
diff --git a/include/linux/platform-feature.h b/include/linux/platform-feature.h
new file mode 100644
index ..6ed859928b97
--- /dev/null
+++ b/include/linux/platform-feature.h
@@ -0,0 +1,15 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _PLATFORM_FEATURE_H
+#define _PLATFORM_FEATURE_H
+
+#include 
+#include 
+
+/* The platform features are starting with the architecture specific ones. */
+#define PLATFORM_FEAT_N		(0 + PLATFORM_ARCH_FEAT_N)
+
+void platform_set(unsigned int feature);
+void platform_clear(unsigned int feature);
+bool platform_has(unsigned int feature);
+
+#endif /* _PLATFORM_FEATURE_H */
diff --git a/kernel/Makefile b/kernel/Makefile
index 847a82bfe0e3..2f412f80110d 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -7,7 +7,7 @@ obj-y = fork.o exec_domain.o panic.o \
cpu.o exit.o softirq.o resource.o \
sysctl.o capability.o ptrace.o user.o \
signal.o sys.o umh.o workqueue.o pid.o task_work.o \
-   extable.o params.o \
+   extable.o params.o platform-feature.o \
kthread.o sys_ni.o nsproxy.o \
notifier.o ksysfs.o cred.o reboot.o \
async.o range.o smpboot.o ucount.o regset.o
diff --git a/kernel/platform-feature.c b/kernel/platform-feature.c
new file mode 100644
index ..cb6a6c3e4fed
--- /dev/null
+++ b/kernel/platform-feature.c
@@ -0,0 +1,27 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include 
+#include 
+#include 
+#include 
+
+#define PLATFORM_FEAT_ARRAY_SZ  BITS_TO_LONGS(PLATFORM_FEAT_N)
+static unsigned long __read_mostly platform_features[PLATFORM_FEAT_ARRAY_SZ];
+
+void platform_set(unsigned int feature)
+{
+   set_bit(feature, platform_features);
+}
+EXPORT_SYMBOL_GPL(platform_set);
+
+void platform_clear(unsigned int feature)
+{
+   clear_bit(feature, platform_features);
+}
+EXPORT_SYMBOL_GPL(platform_clear);
+
+bool platform_has(unsigned int feature)
+{
+   return test_bit(feature, platform_features);
+}
+EXPORT_SYMBOL_GPL(platform_has);
-- 
2.34.1
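
A short usage sketch for the API above; PLATFORM_FEAT_EXAMPLE is a made-up
general flag for illustration (patch 2/2 defines the first real one,
PLATFORM_VIRTIO_RESTRICTED_MEM_ACCESS, and bumps PLATFORM_FEAT_N
accordingly):

#include <linux/platform-feature.h>
#include <linux/printk.h>

/* General flags are numbered after the arch-specific ones. */
#define PLATFORM_FEAT_EXAMPLE	(PLATFORM_ARCH_FEAT_N + 0)	/* made up */

static void __init example_platform_init(void)
{
	/* Early platform/arch code records the capability once... */
	platform_set(PLATFORM_FEAT_EXAMPLE);
}

static void example_consumer(void)
{
	/* ...and any driver can test it later without arch #ifdefs. */
	if (platform_has(PLATFORM_FEAT_EXAMPLE))
		pr_info("example platform feature is active\n");
}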



RE: [PATCH v2 2/2] virtio: replace arch_has_restricted_virtio_memory_access()

2022-04-27 Thread Michael Kelley (LINUX) via Virtualization
From: Juergen Gross  Sent: Wednesday, April 27, 2022 8:34 AM
> 
> Instead of using arch_has_restricted_virtio_memory_access() together
> with CONFIG_ARCH_HAS_RESTRICTED_VIRTIO_MEMORY_ACCESS, replace those
> with platform_has() and a new platform feature
> PLATFORM_VIRTIO_RESTRICTED_MEM_ACCESS.
> 
> Signed-off-by: Juergen Gross 
> ---
> V2:
> - move setting of PLATFORM_VIRTIO_RESTRICTED_MEM_ACCESS in SEV case
>   to sev_setup_arch().
> ---
>  arch/s390/Kconfig|  1 -
>  arch/s390/mm/init.c  | 13 +++--
>  arch/x86/Kconfig |  1 -
>  arch/x86/kernel/cpu/mshyperv.c   |  5 -
>  arch/x86/mm/mem_encrypt.c|  6 --
>  arch/x86/mm/mem_encrypt_amd.c|  4 
>  drivers/virtio/Kconfig   |  6 --
>  drivers/virtio/virtio.c  |  5 ++---
>  include/linux/platform-feature.h |  3 ++-
>  include/linux/virtio_config.h|  9 -
>  10 files changed, 15 insertions(+), 38 deletions(-)
> 
> diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
> index e084c72104f8..f97a22ae69a8 100644
> --- a/arch/s390/Kconfig
> +++ b/arch/s390/Kconfig
> @@ -772,7 +772,6 @@ menu "Virtualization"
>  config PROTECTED_VIRTUALIZATION_GUEST
>   def_bool n
>   prompt "Protected virtualization guest support"
> - select ARCH_HAS_RESTRICTED_VIRTIO_MEMORY_ACCESS
>   help
> Select this option, if you want to be able to run this
> kernel as a protected virtualization KVM guest.
> diff --git a/arch/s390/mm/init.c b/arch/s390/mm/init.c
> index 86ffd0d51fd5..2c3b451813ed 100644
> --- a/arch/s390/mm/init.c
> +++ b/arch/s390/mm/init.c
> @@ -31,6 +31,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -168,22 +169,14 @@ bool force_dma_unencrypted(struct device *dev)
>   return is_prot_virt_guest();
>  }
> 
> -#ifdef CONFIG_ARCH_HAS_RESTRICTED_VIRTIO_MEMORY_ACCESS
> -
> -int arch_has_restricted_virtio_memory_access(void)
> -{
> - return is_prot_virt_guest();
> -}
> -EXPORT_SYMBOL(arch_has_restricted_virtio_memory_access);
> -
> -#endif
> -
>  /* protected virtualization */
>  static void pv_init(void)
>  {
>   if (!is_prot_virt_guest())
>   return;
> 
> + platform_set(PLATFORM_VIRTIO_RESTRICTED_MEM_ACCESS);
> +
>   /* make sure bounce buffers are shared */
>   swiotlb_force = SWIOTLB_FORCE;
>   swiotlb_init(1);
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index b0142e01002e..20ac72546ae4 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -1515,7 +1515,6 @@ config X86_CPA_STATISTICS
>  config X86_MEM_ENCRYPT
>   select ARCH_HAS_FORCE_DMA_UNENCRYPTED
>   select DYNAMIC_PHYSICAL_MASK
> - select ARCH_HAS_RESTRICTED_VIRTIO_MEMORY_ACCESS
>   def_bool n
> 
>  config AMD_MEM_ENCRYPT
> diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
> index 4b67094215bb..965518b9d14b 100644
> --- a/arch/x86/kernel/cpu/mshyperv.c
> +++ b/arch/x86/kernel/cpu/mshyperv.c
> @@ -19,6 +19,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -347,8 +348,10 @@ static void __init ms_hyperv_init_platform(void)
>  #endif
>   /* Isolation VMs are unenlightened SEV-based VMs, thus this 
> check: */
>   if (IS_ENABLED(CONFIG_AMD_MEM_ENCRYPT)) {
> - if (hv_get_isolation_type() != HV_ISOLATION_TYPE_NONE)
> + if (hv_get_isolation_type() != HV_ISOLATION_TYPE_NONE) {
>   cc_set_vendor(CC_VENDOR_HYPERV);
> +
>   platform_set(PLATFORM_VIRTIO_RESTRICTED_MEM_ACCESS);
> + }
>   }
>   }
> 

Unless I'm misunderstanding something, the Hyper-V specific change isn't
needed. Hyper-V doesn't support virtio in the first place, so it's a bit
unexpected to be setting a virtio-related flag in Hyper-V code. Also, Hyper-V
guests call sev_setup_arch() with CC_ATTR_GUEST_MEM_ENCRYPT set,
so this virtio-related flag will get set anyway via that path.

Michael

> diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
> index 50d209939c66..9b6a7c98b2b1 100644
> --- a/arch/x86/mm/mem_encrypt.c
> +++ b/arch/x86/mm/mem_encrypt.c
> @@ -76,9 +76,3 @@ void __init mem_encrypt_init(void)
> 
>   print_mem_encrypt_feature_info();
>  }
> -
> -int arch_has_restricted_virtio_memory_access(void)
> -{
> - return cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT);
> -}
> -EXPORT_SYMBOL_GPL(arch_has_restricted_virtio_memory_access);
> diff --git a/arch/x86/mm/mem_encrypt_amd.c b/arch/x86/mm/mem_encrypt_amd.c
> index 6169053c2854..39b71084d36b 100644
> --- a/arch/x86/mm/mem_encrypt_amd.c
> +++ b/arch/x86/mm/mem_encrypt_amd.c
> @@ -21,6 +21,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
> 
>  #include 
>  #include 
> @@ -206,6 +207,9 @@ void __init sev_setup_arch(void)
>   size = total_mem * 6 / 100;
>   size = clamp_val(size, IO_TLB_DEFAULT_SIZE, SZ_1G);
>   swiotlb_

Re: [PATCH v2 2/2] virtio: replace arch_has_restricted_virtio_memory_access()

2022-04-27 Thread Juergen Gross via Virtualization

On 27.04.22 18:30, Michael Kelley (LINUX) wrote:

From: Juergen Gross  Sent: Wednesday, April 27, 2022 8:34 AM


Instead of using arch_has_restricted_virtio_memory_access() together
with CONFIG_ARCH_HAS_RESTRICTED_VIRTIO_MEMORY_ACCESS, replace those
with platform_has() and a new platform feature
PLATFORM_VIRTIO_RESTRICTED_MEM_ACCESS.

Signed-off-by: Juergen Gross 
---
V2:
- move setting of PLATFORM_VIRTIO_RESTRICTED_MEM_ACCESS in SEV case
   to sev_setup_arch().
---
  arch/s390/Kconfig|  1 -
  arch/s390/mm/init.c  | 13 +++--
  arch/x86/Kconfig |  1 -
  arch/x86/kernel/cpu/mshyperv.c   |  5 -
  arch/x86/mm/mem_encrypt.c|  6 --
  arch/x86/mm/mem_encrypt_amd.c|  4 
  drivers/virtio/Kconfig   |  6 --
  drivers/virtio/virtio.c  |  5 ++---
  include/linux/platform-feature.h |  3 ++-
  include/linux/virtio_config.h|  9 -
  10 files changed, 15 insertions(+), 38 deletions(-)

diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
index e084c72104f8..f97a22ae69a8 100644
--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -772,7 +772,6 @@ menu "Virtualization"
  config PROTECTED_VIRTUALIZATION_GUEST
def_bool n
prompt "Protected virtualization guest support"
-   select ARCH_HAS_RESTRICTED_VIRTIO_MEMORY_ACCESS
help
  Select this option, if you want to be able to run this
  kernel as a protected virtualization KVM guest.
diff --git a/arch/s390/mm/init.c b/arch/s390/mm/init.c
index 86ffd0d51fd5..2c3b451813ed 100644
--- a/arch/s390/mm/init.c
+++ b/arch/s390/mm/init.c
@@ -31,6 +31,7 @@
  #include 
  #include 
  #include 
+#include 
  #include 
  #include 
  #include 
@@ -168,22 +169,14 @@ bool force_dma_unencrypted(struct device *dev)
return is_prot_virt_guest();
  }

-#ifdef CONFIG_ARCH_HAS_RESTRICTED_VIRTIO_MEMORY_ACCESS
-
-int arch_has_restricted_virtio_memory_access(void)
-{
-   return is_prot_virt_guest();
-}
-EXPORT_SYMBOL(arch_has_restricted_virtio_memory_access);
-
-#endif
-
  /* protected virtualization */
  static void pv_init(void)
  {
if (!is_prot_virt_guest())
return;

+   platform_set(PLATFORM_VIRTIO_RESTRICTED_MEM_ACCESS);
+
/* make sure bounce buffers are shared */
swiotlb_force = SWIOTLB_FORCE;
swiotlb_init(1);
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index b0142e01002e..20ac72546ae4 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1515,7 +1515,6 @@ config X86_CPA_STATISTICS
  config X86_MEM_ENCRYPT
select ARCH_HAS_FORCE_DMA_UNENCRYPTED
select DYNAMIC_PHYSICAL_MASK
-   select ARCH_HAS_RESTRICTED_VIRTIO_MEMORY_ACCESS
def_bool n

  config AMD_MEM_ENCRYPT
diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
index 4b67094215bb..965518b9d14b 100644
--- a/arch/x86/kernel/cpu/mshyperv.c
+++ b/arch/x86/kernel/cpu/mshyperv.c
@@ -19,6 +19,7 @@
  #include 
  #include 
  #include 
+#include 
  #include 
  #include 
  #include 
@@ -347,8 +348,10 @@ static void __init ms_hyperv_init_platform(void)
  #endif
	/* Isolation VMs are unenlightened SEV-based VMs, thus this check: */
	if (IS_ENABLED(CONFIG_AMD_MEM_ENCRYPT)) {
-		if (hv_get_isolation_type() != HV_ISOLATION_TYPE_NONE)
+		if (hv_get_isolation_type() != HV_ISOLATION_TYPE_NONE) {
 			cc_set_vendor(CC_VENDOR_HYPERV);
+			platform_set(PLATFORM_VIRTIO_RESTRICTED_MEM_ACCESS);
+		}
 	}
 }



Unless I'm misunderstanding something, the Hyper-V specific change isn't
needed. Hyper-V doesn't support virtio in the first place, so it's a bit
unexpected to be setting a virtio-related flag in Hyper-V code. Also, Hyper-V
guests call sev_setup_arch() with CC_ATTR_GUEST_MEM_ENCRYPT set,
so this virtio-related flag will get set anyway via that path.


Okay, thanks. Will drop that chunk then.


Juergen
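
For context, the drivers/virtio/virtio.c change listed in the diffstat but
not quoted above boils down to replacing the arch hook with a platform_has()
check. A minimal sketch, assuming the platform-feature API from patch 1/2 of
this series (not the verbatim hunk):

/* Sketch: the arch hook is replaced by a platform_has() check. */
static int virtio_features_ok(struct virtio_device *dev)
{
	if (!platform_has(PLATFORM_VIRTIO_RESTRICTED_MEM_ACCESS))
		return 0;

	if (!virtio_has_feature(dev, VIRTIO_F_VERSION_1)) {
		dev_warn(&dev->dev,
			 "device must provide VIRTIO_F_VERSION_1\n");
		return -ENODEV;
	}
	if (!virtio_has_feature(dev, VIRTIO_F_ACCESS_PLATFORM)) {
		dev_warn(&dev->dev,
			 "device must provide VIRTIO_F_ACCESS_PLATFORM\n");
		return -ENODEV;
	}
	return 0;
}

The benefit over the Kconfig-based hook is that the check becomes a plain
runtime test, so s390, SEV and (if it were needed) Hyper-V can all flip the
same feature bit from their own setup paths.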



Re: [PATCH V2 03/11] perf/x86: Add support for TSC in nanoseconds as a perf event clock

2022-04-27 Thread Thomas Gleixner
On Tue, Apr 26 2022 at 09:51, Adrian Hunter wrote:
> On 25/04/22 20:05, Thomas Gleixner wrote:
>> On Mon, Apr 25 2022 at 16:15, Adrian Hunter wrote:
>>> On 25/04/22 12:32, Thomas Gleixner wrote:
 It's hilarious that we still cling to this pvclock abomination, while
 we happily expose TSC deadline timer to the guest. TSC virt scaling was
 implemented in hardware for a reason.
>>>
>>> So you are talking about changing VMX TSC Offset on every VM-Entry to try
>>> to hide the time jumps when the VM is scheduled out? Or neglect that and
>>> just let the time jumps happen?
>>>
>>> If changing VMX TSC Offset, how can TSC be kept consistent between each
>>> VCPU, i.e. wouldn't that mean each VCPU has to have the same VMX TSC
>>> Offset?
>> 
>> Obviously so. That's the only thing which makes sense, no?
>
> [ Sending this again, because I noticed I messed up the email "From" ]
>
> But wouldn't that mean changing all the VCPUs' VMX TSC Offsets at the same time,
> which means when none are currently executing?  How could that be done?

Why would you change TSC offset after the point where a VM is started
and why would it be different per vCPU?

Time is global and time moves on when a vCPU is scheduled out. Anything
else is bonkers, really. If the hypervisor tries to screw with that then
how does the guest do timekeeping in a consistent way?

CLOCK_REALTIME = CLOCK_MONOTONIC + offset

That offset changes when something sets the clock, i.e. clock_settime(),
settimeofday() or adjtimex() in case that NTP cannot compensate or for
the beloved leap seconds adjustment. At any other time the offset is
constant.

CLOCK_MONOTONIC is derived from the underlying clocksource which is
expected to increment with constant frequency and that has to be
consistent across _all_ vCPUs of a particular VM.

So how would a hypervisor 'hide' scheduled out time w/o screwing up
timekeeping completely?

The guest TSC which is based on the host TSC is:

guestTSC = offset + hostTSC * factor;

If you make offset different between guest vCPUs then timekeeping in the
guest is screwed.

The whole point of that paravirt clock was to handle migration between
hosts which did not have the VMCS TSC scaling/offset mechanism. The CPUs
which did not have that went EOL at least 10 years ago.

So what are you concerned about?

Thanks,

tglx
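
A toy illustration of the invariant tglx describes, with made-up names
(guest_tsc(), a plain double for the scaling factor rather than the
hardware fixed-point multiplier):

/* Toy userspace illustration: guestTSC = offset + hostTSC * factor.
 * With a shared offset, guest time is consistent across vCPUs; with
 * per-vCPU offsets, a task migrating between vCPUs can see the clock
 * jump backwards, breaking CLOCK_MONOTONIC. */
#include <stdint.h>
#include <stdio.h>

static uint64_t guest_tsc(uint64_t host_tsc, int64_t offset, double factor)
{
	return (uint64_t)(offset + host_tsc * factor);
}

int main(void)
{
	uint64_t host_tsc = 1000000;

	/* Shared offset: both vCPUs read the same guest time. */
	printf("shared:   vcpu0=%llu vcpu1=%llu\n",
	       (unsigned long long)guest_tsc(host_tsc, -500, 1.0),
	       (unsigned long long)guest_tsc(host_tsc, -500, 1.0));

	/* Per-vCPU offsets: a reader migrating from vcpu0 to vcpu1
	 * observes time going backwards (999500 -> 999100). */
	printf("per-vcpu: vcpu0=%llu vcpu1=%llu\n",
	       (unsigned long long)guest_tsc(host_tsc, -500, 1.0),
	       (unsigned long long)guest_tsc(host_tsc, -900, 1.0));
	return 0;
}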


Re: [PATCH V3 6/9] virtio-ccw: implement synchronize_cbs()

2022-04-27 Thread Halil Pasic
On Wed, 27 Apr 2022 11:27:03 +0200
Cornelia Huck  wrote:

> On Tue, Apr 26 2022, "Michael S. Tsirkin"  wrote:
> 
> > On Tue, Apr 26, 2022 at 05:47:17PM +0200, Cornelia Huck wrote:  
> >> On Mon, Apr 25 2022, "Michael S. Tsirkin"  wrote:
> >>   
> >> > On Mon, Apr 25, 2022 at 11:53:24PM -0400, Michael S. Tsirkin wrote:  
> >> >> On Tue, Apr 26, 2022 at 11:42:45AM +0800, Jason Wang wrote:  
> >> >> > 
> >> >> > On 2022/4/26 11:38, Michael S. Tsirkin wrote:
> >> >> > > On Mon, Apr 25, 2022 at 11:35:41PM -0400, Michael S. Tsirkin wrote: 
> >> >> > >  
> >> >> > > > On Tue, Apr 26, 2022 at 04:29:11AM +0200, Halil Pasic wrote:  
> >> >> > > > > On Mon, 25 Apr 2022 09:59:55 -0400
> >> >> > > > > "Michael S. Tsirkin"  wrote:
> >> >> > > > >   
> >> >> > > > > > On Mon, Apr 25, 2022 at 10:54:24AM +0200, Cornelia Huck 
> >> >> > > > > > wrote:  
> >> >> > > > > > > On Mon, Apr 25 2022, "Michael S. Tsirkin"  
> >> >> > > > > > > wrote:  
> >> >> > > > > > > > On Mon, Apr 25, 2022 at 10:44:15AM +0800, Jason Wang 
> >> >> > > > > > > > wrote:  
> >> >> > > > > > > > > This patch tries to implement the synchronize_cbs() for 
> >> >> > > > > > > > > ccw. For the
> >> >> > > > > > > > > vring_interrupt() that is called via 
> >> >> > > > > > > > > virtio_airq_handler(), the
> >> >> > > > > > > > > synchronization is simply done via the airq_info's 
> >> >> > > > > > > > > lock. For the
> >> >> > > > > > > > > vring_interrupt() that is called via 
> >> >> > > > > > > > > virtio_ccw_int_handler(), a per
> >> >> > > > > > > > > device spinlock for irq is introduced and used in the 
> >> >> > > > > > > > > synchronization
> >> >> > > > > > > > > method.
> >> >> > > > > > > > > 
> >> >> > > > > > > > > Cc: Thomas Gleixner 
> >> >> > > > > > > > > Cc: Peter Zijlstra 
> >> >> > > > > > > > > Cc: "Paul E. McKenney" 
> >> >> > > > > > > > > Cc: Marc Zyngier 
> >> >> > > > > > > > > Cc: Halil Pasic 
> >> >> > > > > > > > > Cc: Cornelia Huck 
> >> >> > > > > > > > > Signed-off-by: Jason Wang   
> >> >> > > > > > > > 
> >> >> > > > > > > > This is the only one that is giving me pause. Halil, 
> >> >> > > > > > > > Cornelia,
> >> >> > > > > > > > should we be concerned about the performance impact here?
> >> >> > > > > > > > Any chance it can be tested?  
> >> >> > > > > > > We can have a bunch of devices using the same airq 
> >> >> > > > > > > structure, and the
> >> >> > > > > > > sync cb creates a choke point, same as 
> >> >> > > > > > > registering/unregistering.  
> >> >> > > > > > BTW can callbacks for multiple VQs run on multiple CPUs at 
> >> >> > > > > > the moment?  
> >> >> > > > > I'm not sure I understand the question.
> >> >> > > > > 
> >> >> > > > > I do think we can have multiple CPUs that are executing some 
> >> >> > > > > portion of
> >> >> > > > > virtio_ccw_int_handler(). So I guess the answer is yes. Connie 
> >> >> > > > > what do you think?
> >> >> > > > > 
> >> >> > > > > On the other hand we could also end up serializing 
> >> >> > > > > synchronize_cbs()
> >> >> > > > > calls for different devices if they happen to use the same 
> >> >> > > > > airq_info. But
> >> >> > > > > this probably was not your question  
> >> >> > > > 
> >> >> > > > I am less concerned about  synchronize_cbs being slow and more 
> >> >> > > > about
> >> >> > > > the slowdown in interrupt processing itself.
> >> >> > > >   
> >> >> > > > > > this patch serializes them on a spinlock.
> >> >> > > > > >   
> >> >> > > > > Those could then pile up on the newly introduced spinlock.  
> >> 
> >> How bad would that be in practice? IIUC, we hit on the spinlock when
> >> - doing synchronize_cbs (should be rare)
> >> - processing queue interrupts for devices using per-device indicators
> >>   (which is the non-preferred path, which I would basically only expect
> >>   when running on an ancient or non-standard hypervisor)  
> >
> > this one is my concern. I am worried serializing everything on a single lock
> > will drastically regress performance here.  
> 
> Yeah, that case could get much worse. OTOH, how likely is it that any
> setup that runs a recent kernel will actually end up with devices using
> per-device indicators? Anything running under a QEMU released in the
> last couple of years is unlikely to not use airqs, I think. Halil, do
> you think that the classic indicator setup would be more common on any
> non-QEMU hypervisors?
> 

I really don't know. My opinion is that two-stage indicators are kind
of recommended for anybody who cares about notification performance.

> IOW, how much effort is it worth spending on optimizing this case? We
> certainly should explore any simple solutions, but I don't think we need
> to twist ourselves into pretzels to solve it.
> 

Frankly, I would be fine with an rwlock-based solution as proposed by
Jason. My rationale is: we recommend two-stage indicators, and the two-stage
indicators are already encumbered by an rwlock on the interrupt
path. Yes, the coalescence of adapter interrupts is architectu

Re: [PATCH V3 6/9] virtio-ccw: implement synchronize_cbs()

2022-04-27 Thread Jason Wang
On Thu, Apr 28, 2022 at 10:43 AM Halil Pasic  wrote:
>
> On Wed, 27 Apr 2022 11:27:03 +0200
> Cornelia Huck  wrote:
>
> > On Tue, Apr 26 2022, "Michael S. Tsirkin"  wrote:
> >
> > > On Tue, Apr 26, 2022 at 05:47:17PM +0200, Cornelia Huck wrote:
> > >> On Mon, Apr 25 2022, "Michael S. Tsirkin"  wrote:
> > >>
> > >> > On Mon, Apr 25, 2022 at 11:53:24PM -0400, Michael S. Tsirkin wrote:
> > >> >> On Tue, Apr 26, 2022 at 11:42:45AM +0800, Jason Wang wrote:
> > >> >> >
> > >> >> > On 2022/4/26 11:38, Michael S. Tsirkin wrote:
> > >> >> > > On Mon, Apr 25, 2022 at 11:35:41PM -0400, Michael S. Tsirkin 
> > >> >> > > wrote:
> > >> >> > > > On Tue, Apr 26, 2022 at 04:29:11AM +0200, Halil Pasic wrote:
> > >> >> > > > > On Mon, 25 Apr 2022 09:59:55 -0400
> > >> >> > > > > "Michael S. Tsirkin"  wrote:
> > >> >> > > > >
> > >> >> > > > > > On Mon, Apr 25, 2022 at 10:54:24AM +0200, Cornelia Huck 
> > >> >> > > > > > wrote:
> > >> >> > > > > > > On Mon, Apr 25 2022, "Michael S. Tsirkin" 
> > >> >> > > > > > >  wrote:
> > >> >> > > > > > > > On Mon, Apr 25, 2022 at 10:44:15AM +0800, Jason Wang 
> > >> >> > > > > > > > wrote:
> > >> >> > > > > > > > > This patch tries to implement the synchronize_cbs() 
> > >> >> > > > > > > > > for ccw. For the
> > >> >> > > > > > > > > vring_interrupt() that is called via 
> > >> >> > > > > > > > > virtio_airq_handler(), the
> > >> >> > > > > > > > > synchronization is simply done via the airq_info's 
> > >> >> > > > > > > > > lock. For the
> > >> >> > > > > > > > > vring_interrupt() that is called via 
> > >> >> > > > > > > > > virtio_ccw_int_handler(), a per
> > >> >> > > > > > > > > device spinlock for irq is introduced and used in the 
> > >> >> > > > > > > > > synchronization
> > >> >> > > > > > > > > method.
> > >> >> > > > > > > > >
> > >> >> > > > > > > > > Cc: Thomas Gleixner 
> > >> >> > > > > > > > > Cc: Peter Zijlstra 
> > >> >> > > > > > > > > Cc: "Paul E. McKenney" 
> > >> >> > > > > > > > > Cc: Marc Zyngier 
> > >> >> > > > > > > > > Cc: Halil Pasic 
> > >> >> > > > > > > > > Cc: Cornelia Huck 
> > >> >> > > > > > > > > Signed-off-by: Jason Wang 
> > >> >> > > > > > > >
> > >> >> > > > > > > > This is the only one that is giving me pause. Halil, 
> > >> >> > > > > > > > Cornelia,
> > >> >> > > > > > > > should we be concerned about the performance impact 
> > >> >> > > > > > > > here?
> > >> >> > > > > > > > Any chance it can be tested?
> > >> >> > > > > > > We can have a bunch of devices using the same airq 
> > >> >> > > > > > > structure, and the
> > >> >> > > > > > > sync cb creates a choke point, same as 
> > >> >> > > > > > > registering/unregistering.
> > >> >> > > > > > BTW can callbacks for multiple VQs run on multiple CPUs at 
> > >> >> > > > > > the moment?
> > >> >> > > > > I'm not sure I understand the question.
> > >> >> > > > >
> > >> >> > > > > I do think we can have multiple CPUs that are executing some 
> > >> >> > > > > portion of
> > >> >> > > > > virtio_ccw_int_handler(). So I guess the answer is yes. 
> > >> >> > > > > Connie what do you think?
> > >> >> > > > >
> > >> >> > > > > On the other hand we could also end up serializing 
> > >> >> > > > > synchronize_cbs()
> > >> >> > > > > calls for different devices if they happen to use the same 
> > >> >> > > > > airq_info. But
> > >> >> > > > > this probably was not your question
> > >> >> > > >
> > >> >> > > > I am less concerned about  synchronize_cbs being slow and more 
> > >> >> > > > about
> > >> >> > > > the slowdown in interrupt processing itself.
> > >> >> > > >
> > >> >> > > > > > this patch serializes them on a spinlock.
> > >> >> > > > > >
> > >> >> > > > > Those could then pile up on the newly introduced spinlock.
> > >>
> > >> How bad would that be in practice? IIUC, we hit on the spinlock when
> > >> - doing synchronize_cbs (should be rare)
> > >> - processing queue interrupts for devices using per-device indicators
> > >>   (which is the non-preferred path, which I would basically only expect
> > >>   when running on an ancient or non-standard hypervisor)
> > >
> > > this one is my concern. I am worried serializing everything on a single 
> > > lock
> > > will drastically regress performance here.
> >
> > Yeah, that case could get much worse. OTOH, how likely is it that any
> > setup that runs a recent kernel will actually end up with devices using
> > per-device indicators? Anything running under a QEMU released in the
> > last couple of years is unlikely to not use airqs, I think. Halil, do
> > you think that the classic indicator setup would be more common on any
> > non-QEMU hypervisors?
> >
>
> I really don't know. My opinion is that two-stage indicators are kind
> of recommended for anybody who cares about notification performance.
>
> > IOW, how much effort is it worth spending on optimizing this case? We
> > certainly should explore any simple solutions, but I don't think we need
> > to twist ourselves into pretzels to solve it.
> >
>
> Frankly, I would be fine with an rwlock-based

Re: [PATCH v2 1/2] vdpa: add the check for id_table in struct vdpa_mgmt_dev

2022-04-27 Thread Jason Wang
On Thu, Apr 28, 2022 at 9:56 AM Cindy Lu  wrote:
>
> On Wed, Apr 27, 2022 at 12:04 PM Jason Wang  wrote:
> >
> >
> > On 2022/4/27 10:01, Cindy Lu wrote:
> > > On Mon, Apr 25, 2022 at 5:00 PM Jason Wang  wrote:
> > >> On Mon, Apr 25, 2022 at 2:27 PM Cindy Lu  wrote:
> > >>> To support the dynamic ids in vp_vdpa, we need to add a check for the
> > >>> id table. If the id table is NULL, we will not set the device type
> > >>>
> > >>> Signed-off-by: Cindy Lu 
> > >>> ---
> > >>>   drivers/vdpa/vdpa.c | 11 +++
> > >>>   1 file changed, 7 insertions(+), 4 deletions(-)
> > >>>
> > >>> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
> > >>> index 1ea525433a5c..09edd92cede0 100644
> > >>> --- a/drivers/vdpa/vdpa.c
> > >>> +++ b/drivers/vdpa/vdpa.c
> > >>> @@ -492,10 +492,13 @@ static int vdpa_mgmtdev_fill(const struct vdpa_mgmt_dev *mdev, struct sk_buff *m
> > >>>  if (err)
> > >>>  goto msg_err;
> > >>>
> > >>> -   while (mdev->id_table[i].device) {
> > >>> -   if (mdev->id_table[i].device <= 63)
> > >>> -   supported_classes |= BIT_ULL(mdev->id_table[i].device);
> > >>> -   i++;
> > >>> +   if (mdev->id_table != NULL) {
> > >>> +   while (mdev->id_table[i].device) {
> > >>> +   if (mdev->id_table[i].device <= 63)
> > >>> +   supported_classes |=
> > >>> +   BIT_ULL(mdev->id_table[i].device);
> > >>> +   i++;
> > >>> +   }
> > >>>  }
> > >> This will cause 0 to be advertised as the supported classes.
> > >>
> > >> I wonder if we can simply use VIRTIO_DEV_ANY_ID here (and need to
> > >> export it via uAPI probably).
> > >>
> > >> Thanks
> > >>
> > > Like the below one? Not sure if it's OK to use it like this?
> > > static struct virtio_device_id vp_vdpa_id_table[] = {
> > > { VIRTIO_DEV_ANY_ID, VIRTIO_DEV_ANY_ID },
> > > { 0 },
> > > };
> >
> >
> > Something like this.
> >
> > Thanks
> >
> >
> I have checked the code; this may not work, because the
> #define VIRTIO_DEV_ANY_ID 0x
>  it won't work in
> supported_classes |= BIT_ULL(mdev->id_table[i].device);
> if we change to
> supported_classes |= VIRTIO_DEV_ANY_ID;
> the vdpa dev show will be
> pci/:00:04.0:
>   supported_classes net block < unknown class > < unknown class > <
> unknown class > < unknown class > < unknown class > < unknow>
>   max_supported_vqs 3

That's why I suggest exporting the ANY_ID via uAPI and then we can fix
the userspace.

>   dev_features CSUM GUEST_CSUM CTRL_GUEST_OFFLOADS MAC GUEST_TSO4
> GUEST_TSO6 GUEST_ECN GUEST_UFO HOST_TSO4 HOST_TSO6 HOST_
>  I think we can use
> static struct virtio_device_id id_table[] = {
> { VIRTIO_ID_NET, VIRTIO_DEV_ANY_ID },
> { 0 },
> };
>  if we need to add another type of device, we can add the device id at that 
> type
>

My point is that we have supported any virtio device before. But
after this change, we only support virtio-net.

So if we choose to use id arrays, let's just add all possible virtio
devices that are supported by the kernel here.

Thanks

> Thanks
> cindy
>
>
>
> > >
> > >
> > >>>  if (nla_put_u64_64bit(msg, VDPA_ATTR_MGMTDEV_SUPPORTED_CLASSES,
> > >>> --
> > >>> 2.34.1
> > >>>
> >
>
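
One possible shape for the wildcard handling being discussed above; the
VDPA_SUPPORTED_CLASSES_ANY name is made up here to stand for the uAPI bit
Jason suggests exporting (a sketch, not a posted patch):

/*
 * Hypothetical sketch of vdpa_mgmtdev_fill() treating VIRTIO_DEV_ANY_ID
 * as a wildcard. BIT_ULL(VIRTIO_DEV_ANY_ID) would shift by 0xffffffff,
 * which is what produced the "< unknown class >" output Cindy saw.
 */
u64 supported_classes = 0;
int i = 0;

while (mdev->id_table[i].device) {
	if (mdev->id_table[i].device == VIRTIO_DEV_ANY_ID) {
		supported_classes = VDPA_SUPPORTED_CLASSES_ANY; /* made up */
		break;
	}
	if (mdev->id_table[i].device <= 63)
		supported_classes |= BIT_ULL(mdev->id_table[i].device);
	i++;
}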


Re: [PATCH v2 1/2] vdpa: add the check for id_table in struct vdpa_mgmt_dev

2022-04-27 Thread Jason Wang
On Thu, Apr 28, 2022 at 11:21 AM Cindy Lu  wrote:
>
> On Thu, Apr 28, 2022 at 11:07 AM Jason Wang  wrote:
> >
> > On Thu, Apr 28, 2022 at 9:56 AM Cindy Lu  wrote:
> > >
> > > On Wed, Apr 27, 2022 at 12:04 PM Jason Wang  wrote:
> > > >
> > > >
> > > > On 2022/4/27 10:01, Cindy Lu wrote:
> > > > > On Mon, Apr 25, 2022 at 5:00 PM Jason Wang  
> > > > > wrote:
> > > > >> On Mon, Apr 25, 2022 at 2:27 PM Cindy Lu  wrote:
> > > > >>> To support the dynamic ids in vp_vdpa, we need to add a check for the
> > > > >>> id table. If the id table is NULL, we will not set the device type
> > > > >>>
> > > > >>> Signed-off-by: Cindy Lu 
> > > > >>> ---
> > > > >>>   drivers/vdpa/vdpa.c | 11 +++
> > > > >>>   1 file changed, 7 insertions(+), 4 deletions(-)
> > > > >>>
> > > > >>> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
> > > > >>> index 1ea525433a5c..09edd92cede0 100644
> > > > >>> --- a/drivers/vdpa/vdpa.c
> > > > >>> +++ b/drivers/vdpa/vdpa.c
> > > > >>> @@ -492,10 +492,13 @@ static int vdpa_mgmtdev_fill(const struct vdpa_mgmt_dev *mdev, struct sk_buff *m
> > > > >>>  if (err)
> > > > >>>  goto msg_err;
> > > > >>>
> > > > >>> -   while (mdev->id_table[i].device) {
> > > > >>> -   if (mdev->id_table[i].device <= 63)
> > > > >>> -   supported_classes |= BIT_ULL(mdev->id_table[i].device);
> > > > >>> -   i++;
> > > > >>> +   if (mdev->id_table != NULL) {
> > > > >>> +   while (mdev->id_table[i].device) {
> > > > >>> +   if (mdev->id_table[i].device <= 63)
> > > > >>> +   supported_classes |=
> > > > >>> +   BIT_ULL(mdev->id_table[i].device);
> > > > >>> +   i++;
> > > > >>> +   }
> > > > >>>  }
> > > > >> This will cause 0 to be advertised as the supported classes.
> > > > >>
> > > > >> I wonder if we can simply use VIRTIO_DEV_ANY_ID here (and need to
> > > > >> export it via uAPI probably).
> > > > >>
> > > > >> Thanks
> > > > >>
> > > > > Like the below one? Not sure if it's OK to use it like this?
> > > > > static struct virtio_device_id vp_vdpa_id_table[] = {
> > > > > { VIRTIO_DEV_ANY_ID, VIRTIO_DEV_ANY_ID },
> > > > > { 0 },
> > > > > };
> > > >
> > > >
> > > > Something like this.
> > > >
> > > > Thanks
> > > >
> > > >
> > > I have checked the code; this may not work, because the
> > > #define VIRTIO_DEV_ANY_ID 0x
> > >  it won't work in
> > > supported_classes |= BIT_ULL(mdev->id_table[i].device);
> > > if we change to
> > > supported_classes |= VIRTIO_DEV_ANY_ID;
> > > the vdpa dev show will be
> > > pci/:00:04.0:
> > >   supported_classes net block < unknown class > < unknown class > <
> > > unknown class > < unknown class > < unknown class > < unknow>
> > >   max_supported_vqs 3
> >
> > That's why I suggest exporting the ANY_ID via uAPI and then we can fix
> > the userspace.
> >
> Sure. But I think maybe we can fix this in another patch, since it is
> related to userspace

Yes.

>
> > >   dev_features CSUM GUEST_CSUM CTRL_GUEST_OFFLOADS MAC GUEST_TSO4
> > > GUEST_TSO6 GUEST_ECN GUEST_UFO HOST_TSO4 HOST_TSO6 HOST_
> > >  I think we can use
> > > static struct virtio_device_id id_table[] = {
> > > { VIRTIO_ID_NET, VIRTIO_DEV_ANY_ID },
> > > { 0 },
> > > };
> > >  if we need to add another type of device, we can add the device id at 
> > > that type
> > >
> >
> > My point is that we have supported any virtio device before. But
> > after this change, we only support virtio-net.
> >
> > So if we choose to use id arrays, let's just add all possible virtio
> > devices that are supported by the kernel here.
> >
> Sorry, I didn't make it clear; I mean we can use the vp_vdpa device id as
>  static struct virtio_device_id id_table[] = {
>  { VIRTIO_ID_NET, VIRTIO_DEV_ANY_ID },
> { 0 },
> since it now only supports the net device

This is not what I read from the code:

static int virtio_vdpa_probe(struct vdpa_device *vdpa)
{
...
vd_dev->vdev.id.device = ops->get_device_id(vdpa);
if (vd_dev->vdev.id.device == 0)
goto err;

vd_dev->vdev.id.vendor = ops->get_vendor_id(vdpa);
ret = register_virtio_device(&vd_dev->vdev);
...
}

?

Thanks

>  and this will make vp_vdpa work.
>
> Thanks
> cindy
>
> > Thanks
> >
> > > Thanks
> > > cindy
> > >
> > >
> > >
> > > > >
> > > > >
> > > > >>>  if (nla_put_u64_64bit(msg, VDPA_ATTR_MGMTDEV_SUPPORTED_CLASSES,
> > > > >>> --
> > > > >>> 2.34.1
> > > > >>>
> > > >
> > >
> >
>


Re: [PATCH V3 6/9] virtio-ccw: implement synchronize_cbs()

2022-04-27 Thread Michael S. Tsirkin
On Thu, Apr 28, 2022 at 04:43:15AM +0200, Halil Pasic wrote:
> On Wed, 27 Apr 2022 11:27:03 +0200
> Cornelia Huck  wrote:
> 
> > On Tue, Apr 26 2022, "Michael S. Tsirkin"  wrote:
> > 
> > > On Tue, Apr 26, 2022 at 05:47:17PM +0200, Cornelia Huck wrote:  
> > >> On Mon, Apr 25 2022, "Michael S. Tsirkin"  wrote:
> > >>   
> > >> > On Mon, Apr 25, 2022 at 11:53:24PM -0400, Michael S. Tsirkin wrote:  
> > >> >> On Tue, Apr 26, 2022 at 11:42:45AM +0800, Jason Wang wrote:  
> > >> >> > 
> > >> >> > On 2022/4/26 11:38, Michael S. Tsirkin wrote:
> > >> >> > > On Mon, Apr 25, 2022 at 11:35:41PM -0400, Michael S. Tsirkin 
> > >> >> > > wrote:  
> > >> >> > > > On Tue, Apr 26, 2022 at 04:29:11AM +0200, Halil Pasic wrote:  
> > >> >> > > > > On Mon, 25 Apr 2022 09:59:55 -0400
> > >> >> > > > > "Michael S. Tsirkin"  wrote:
> > >> >> > > > >   
> > >> >> > > > > > On Mon, Apr 25, 2022 at 10:54:24AM +0200, Cornelia Huck 
> > >> >> > > > > > wrote:  
> > >> >> > > > > > > On Mon, Apr 25 2022, "Michael S. Tsirkin" 
> > >> >> > > > > > >  wrote:  
> > >> >> > > > > > > > On Mon, Apr 25, 2022 at 10:44:15AM +0800, Jason Wang 
> > >> >> > > > > > > > wrote:  
> > >> >> > > > > > > > > This patch tries to implement the synchronize_cbs() 
> > >> >> > > > > > > > > for ccw. For the
> > >> >> > > > > > > > > vring_interrupt() that is called via 
> > >> >> > > > > > > > > virtio_airq_handler(), the
> > >> >> > > > > > > > > synchronization is simply done via the airq_info's 
> > >> >> > > > > > > > > lock. For the
> > >> >> > > > > > > > > vring_interrupt() that is called via 
> > >> >> > > > > > > > > virtio_ccw_int_handler(), a per
> > >> >> > > > > > > > > device spinlock for irq is introduced and used in the 
> > >> >> > > > > > > > > synchronization
> > >> >> > > > > > > > > method.
> > >> >> > > > > > > > > 
> > >> >> > > > > > > > > Cc: Thomas Gleixner 
> > >> >> > > > > > > > > Cc: Peter Zijlstra 
> > >> >> > > > > > > > > Cc: "Paul E. McKenney" 
> > >> >> > > > > > > > > Cc: Marc Zyngier 
> > >> >> > > > > > > > > Cc: Halil Pasic 
> > >> >> > > > > > > > > Cc: Cornelia Huck 
> > >> >> > > > > > > > > Signed-off-by: Jason Wang   
> > >> >> > > > > > > > 
> > >> >> > > > > > > > This is the only one that is giving me pause. Halil, 
> > >> >> > > > > > > > Cornelia,
> > >> >> > > > > > > > should we be concerned about the performance impact 
> > >> >> > > > > > > > here?
> > >> >> > > > > > > > Any chance it can be tested?  
> > >> >> > > > > > > We can have a bunch of devices using the same airq 
> > >> >> > > > > > > structure, and the
> > >> >> > > > > > > sync cb creates a choke point, same as 
> > >> >> > > > > > > registering/unregistering.  
> > >> >> > > > > > BTW can callbacks for multiple VQs run on multiple CPUs at 
> > >> >> > > > > > the moment?  
> > >> >> > > > > I'm not sure I understand the question.
> > >> >> > > > > 
> > >> >> > > > > I do think we can have multiple CPUs that are executing some 
> > >> >> > > > > portion of
> > >> >> > > > > virtio_ccw_int_handler(). So I guess the answer is yes. 
> > >> >> > > > > Connie what do you think?
> > >> >> > > > > 
> > >> >> > > > > On the other hand we could also end up serializing 
> > >> >> > > > > synchronize_cbs()
> > >> >> > > > > calls for different devices if they happen to use the same 
> > >> >> > > > > airq_info. But
> > >> >> > > > > this probably was not your question  
> > >> >> > > > 
> > >> >> > > > I am less concerned about  synchronize_cbs being slow and more 
> > >> >> > > > about
> > >> >> > > > the slowdown in interrupt processing itself.
> > >> >> > > >   
> > >> >> > > > > > this patch serializes them on a spinlock.
> > >> >> > > > > >   
> > >> >> > > > > Those could then pile up on the newly introduced spinlock.  
> > >> 
> > >> How bad would that be in practice? IIUC, we hit on the spinlock when
> > >> - doing synchronize_cbs (should be rare)
> > >> - processing queue interrupts for devices using per-device indicators
> > >>   (which is the non-preferred path, which I would basically only expect
> > >>   when running on an ancient or non-standard hypervisor)  
> > >
> > > this one is my concern. I am worried serializing everything on a single 
> > > lock
> > > will drastically regress performance here.  
> > 
> > Yeah, that case could get much worse. OTOH, how likely is it that any
> > setup that runs a recent kernel will actually end up with devices using
> > per-device indicators? Anything running under a QEMU released in the
> > last couple of years is unlikely to not use airqs, I think. Halil, do
> > you think that the classic indicator setup would be more common on any
> > non-QEMU hypervisors?
> > 
> 
> I really don't know. My opinion is that two-stage indicators are kind
> of recommended for anybody who cares about notification performance.
> 
> > IOW, how much effort is it worth spending on optimizing this case? We
> > certainly should explore any simple solutions, but I don't think we need
> > to twist ourselves into pretzels t

Re: [PATCH V3 6/9] virtio-ccw: implement synchronize_cbs()

2022-04-27 Thread Michael S. Tsirkin
On Thu, Apr 28, 2022 at 11:04:41AM +0800, Jason Wang wrote:
> > But my guess is that rwlock + some testing for the legacy indicator case
> > just to double check if there is a heavy regression despite our
> > expectations to see none should do the trick.
> 
> I suggest this: an rwlock (for the non-airq case) seems better than a
> spinlock, but in the worst case it will cause cache line bouncing. But I
> wonder if it's
> noticeable (anyhow it has been used for airq).
> 
> Thanks

Which existing rwlock does airq use right now? Can we take it to sync?

-- 
MST



Re: [PATCH V3 6/9] virtio-ccw: implement synchronize_cbs()

2022-04-27 Thread Jason Wang
On Thu, Apr 28, 2022 at 1:24 PM Michael S. Tsirkin  wrote:
>
> On Thu, Apr 28, 2022 at 11:04:41AM +0800, Jason Wang wrote:
> > > But my guess is that rwlock + some testing for the legacy indicator case
> > > just to double check if there is a heavy regression despite our
> > > expectations to see none should do the trick.
> >
> > I suggest this: an rwlock (for the non-airq case) seems better than a
> > spinlock, but in the worst case it will cause cache line bouncing. But I
> > wonder if it's
> > noticeable (anyhow it has been used for airq).
> >
> > Thanks
>
> Which existing rwlock does airq use right now? Can we take it to sync?

It's the rwlock in airq_info; it has already been used in this patch.

write_lock(&info->lock);
write_unlock(&info->lock);

But the problem is, it looks to me like there could be a case where airq is
not used (virtio_ccw_int_handler()). That's why the patch uses a
spinlock; it could be optimized by using an rwlock as well.

Thanks

>
> --
> MST
>



Re: [PATCH V3 6/9] virtio-ccw: implement synchronize_cbs()

2022-04-27 Thread Michael S. Tsirkin
On Thu, Apr 28, 2022 at 01:51:59PM +0800, Jason Wang wrote:
> On Thu, Apr 28, 2022 at 1:24 PM Michael S. Tsirkin  wrote:
> >
> > On Thu, Apr 28, 2022 at 11:04:41AM +0800, Jason Wang wrote:
> > > > But my guess is that rwlock + some testing for the legacy indicator case
> > > > just to double check if there is a heavy regression despite our
> > > > expectations to see none should do the trick.
> > >
> > > I suggest this: an rwlock (for the non-airq case) seems better than a
> > > spinlock, but in the worst case it will cause cache line bouncing. But I
> > > wonder if it's
> > > noticeable (anyhow it has been used for airq).
> > >
> > > Thanks
> >
> > Which existing rwlock does airq use right now? Can we take it to sync?
> 
> > It's the rwlock in airq_info; it has already been used in this patch.
> 
> write_lock(&info->lock);
> write_unlock(&info->lock);
> 
> > But the problem is, it looks to me like there could be a case where airq
> > is not used (virtio_ccw_int_handler()). That's why the patch uses a
> > spinlock; it could be optimized by using an rwlock as well.
> 
> Thanks

Ah, right. So let's take that on the legacy path too and Halil promises
to test to make sure performance isn't impacted too badly?

> >
> > --
> > MST
> >



Re: [PATCH V3 6/9] virtio-ccw: implement synchronize_cbs()

2022-04-27 Thread Jason Wang
On Thu, Apr 28, 2022 at 1:55 PM Michael S. Tsirkin  wrote:
>
> On Thu, Apr 28, 2022 at 01:51:59PM +0800, Jason Wang wrote:
> > On Thu, Apr 28, 2022 at 1:24 PM Michael S. Tsirkin  wrote:
> > >
> > > On Thu, Apr 28, 2022 at 11:04:41AM +0800, Jason Wang wrote:
> > > > > But my guess is that rwlock + some testing for the legacy indicator 
> > > > > case
> > > > > just to double check if there is a heavy regression despite our
> > > > > expectations to see none should do the trick.
> > > >
> > > > I suggest this: an rwlock (for the non-airq case) seems better than
> > > > a spinlock, but in the worst case it will cause cache line bouncing.
> > > > But I wonder if it's
> > > > noticeable (anyhow it has been used for airq).
> > > >
> > > > Thanks
> > >
> > > Which existing rwlock does airq use right now? Can we take it to sync?
> >
> > It's the rwlock in airq_info; it has already been used in this patch.
> >
> > write_lock(&info->lock);
> > write_unlock(&info->lock);
> >
> > But the problem is, it looks to me like there could be a case where airq
> > is not used (virtio_ccw_int_handler()). That's why the patch uses a
> > spinlock; it could be optimized by using an rwlock as well.
> >
> > Thanks
>
> Ah, right. So let's take that on the legacy path too and Halil promises
> to test to make sure performance isn't impacted too badly?

I think what you meant is using a dedicated rwlock instead of trying
to reuse one of the airq_info locks.

If this is true, it should be fine.

Thanks

>
> > >
> > > --
> > > MST
> > >
>
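
Putting the thread together, the dedicated-rwlock variant being converged on
would look roughly like this; vcdev->irq_lock is a made-up field name and the
code is a sketch, not the posted patch:

/* Readers on the interrupt path, writer in synchronize_cbs(). */
struct virtio_ccw_device {
	/* ... */
	rwlock_t irq_lock;	/* made-up name for the dedicated lock */
};

static void virtio_ccw_int_handler(struct ccw_device *cdev,
				   unsigned long intparm,
				   struct irb *irb)
{
	/* Interrupts for the same device may run on several CPUs;
	 * readers do not serialize each other, unlike a spinlock. */
	read_lock(&vcdev->irq_lock);
	vring_interrupt(0, vq);
	read_unlock(&vcdev->irq_lock);
}

static void virtio_ccw_synchronize_cbs(struct virtio_device *vdev)
{
	struct virtio_ccw_device *vcdev = to_vc_device(vdev);

	/* Taking the write side waits out all in-flight callbacks;
	 * _irq because the readers run in interrupt context. */
	write_lock_irq(&vcdev->irq_lock);
	write_unlock_irq(&vcdev->irq_lock);
}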



Re: [PATCH V3 6/9] virtio-ccw: implement synchronize_cbs()

2022-04-27 Thread Michael S. Tsirkin
On Thu, Apr 28, 2022 at 02:02:16PM +0800, Jason Wang wrote:
> On Thu, Apr 28, 2022 at 1:55 PM Michael S. Tsirkin  wrote:
> >
> > On Thu, Apr 28, 2022 at 01:51:59PM +0800, Jason Wang wrote:
> > > On Thu, Apr 28, 2022 at 1:24 PM Michael S. Tsirkin  
> > > wrote:
> > > >
> > > > On Thu, Apr 28, 2022 at 11:04:41AM +0800, Jason Wang wrote:
> > > > > > But my guess is that rwlock + some testing for the legacy indicator 
> > > > > > case
> > > > > > just to double check if there is a heavy regression despite our
> > > > > > expectations to see none should do the trick.
> > > > >
> > > > > I suggest this: an rwlock (for the non-airq case) seems better than
> > > > > a spinlock, but in the worst case it will cause cache line bouncing.
> > > > > But I wonder if it's
> > > > > noticeable (anyhow it has been used for airq).
> > > > >
> > > > > Thanks
> > > >
> > > > Which existing rwlock does airq use right now? Can we take it to sync?
> > >
> > > It's the rwlock in airq_info; it has already been used in this patch.
> > >
> > > write_lock(&info->lock);
> > > write_unlock(&info->lock);
> > >
> > > But the problem is, it looks to me like there could be a case where
> > > airq is not used (virtio_ccw_int_handler()). That's why the patch uses
> > > a spinlock; it could be optimized by using an rwlock as well.
> > >
> > > Thanks
> >
> > Ah, right. So let's take that on the legacy path too and Halil promises
> > to test to make sure performance isn't impacted too badly?
> 
> I think what you meant is using a dedicated rwlock instead of trying
> to reuse one of the airq_info locks.
> 
> If this is true, it should be fine.
> 
> Thanks

yes

> >
> > > >
> > > > --
> > > > MST
> > > >
> >
