Re: [PATCH v4] Use io_uring_register_ring_fd() to skip fd operations

2022-05-30 Thread Stefan Hajnoczi
On Fri, 22 Apr 2022 at 16:40, olc  wrote:
>
> Hi Stefan,
> I've tested the code and it behaves as you expected. Should I add this to a 
> new patch version or leave it as is?

Hi Sam,
Sorry I missed this email. Please send a new version of the patch with
CONFIG_LIBURING_REGISTER_RING_FD.

Thanks,
Stefan



Re: [PATCH v4] Use io_uring_register_ring_fd() to skip fd operations

2022-04-22 Thread olc
Hi Stefan,
I've tested the code and it behaves as you expected. Should I add this to a
new patch version or leave it as is?

Sam

Stefan Hajnoczi wrote on Fri, Apr 22, 2022 at 23:10:

> On Fri, Apr 22, 2022 at 12:36:49AM +0800, Sam Li wrote:
> > Linux recently added a new io_uring(7) optimization API that QEMU
> > doesn't take advantage of yet. The liburing library that QEMU uses
> > has added a corresponding new API calling io_uring_register_ring_fd().
> > When this API is called after creating the ring, the io_uring_submit()
> > library function passes a flag to the io_uring_enter(2) syscall
> > allowing it to skip the ring file descriptor fdget()/fdput()
> > operations. This saves some CPU cycles.
> >
> > Signed-off-by: Sam Li 
> > ---
> >  block/io_uring.c | 10 +-
> >  1 file changed, 9 insertions(+), 1 deletion(-)
> >
> > diff --git a/block/io_uring.c b/block/io_uring.c
> > index 782afdb433..5247fb79e2 100644
> > --- a/block/io_uring.c
> > +++ b/block/io_uring.c
> > @@ -435,8 +435,16 @@ LuringState *luring_init(Error **errp)
> >  }
> >
> >  ioq_init(&s->io_q);
> > -return s;
> > +if (io_uring_register_ring_fd(&s->ring) < 0) {
>
> What happens when QEMU is built against an older version of liburing
> that lacks the io_uring_register_ring_fd() API?
>
> I guess there will be a compiler error because the function prototype is
> missing in .
>
> This can be addressed by checking for the presence of the function in
> meson.build:
>
> +config_host_data.set('CONFIG_LIBURING_REGISTER_RING_FD',
> cc.has_function('io_uring_register_ring_fd', prefix: '#include
> '))
>
> Then block/io_uring.c can call the function only when available:
>
> +#ifdef CONFIG_LIBURING_REGISTER_RING_FD
> +io_uring_register_ring_fd(&s->ring);
> +#endif
>
> (I haven't tested this code but it should be close.)
>
> Stefan
>


Re: [PATCH v4] Use io_uring_register_ring_fd() to skip fd operations

2022-04-22 Thread Stefan Hajnoczi
On Fri, Apr 22, 2022 at 12:36:49AM +0800, Sam Li wrote:
> Linux recently added a new io_uring(7) optimization API that QEMU
> doesn't take advantage of yet. The liburing library that QEMU uses
> has added a corresponding new API calling io_uring_register_ring_fd().
> When this API is called after creating the ring, the io_uring_submit()
> library function passes a flag to the io_uring_enter(2) syscall
> allowing it to skip the ring file descriptor fdget()/fdput()
> operations. This saves some CPU cycles.
> 
> Signed-off-by: Sam Li 
> ---
>  block/io_uring.c | 10 +-
>  1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/block/io_uring.c b/block/io_uring.c
> index 782afdb433..5247fb79e2 100644
> --- a/block/io_uring.c
> +++ b/block/io_uring.c
> @@ -435,8 +435,16 @@ LuringState *luring_init(Error **errp)
>  }
>  
>  ioq_init(&s->io_q);
> -return s;
> +if (io_uring_register_ring_fd(&s->ring) < 0) {

What happens when QEMU is built against an older version of liburing
that lacks the io_uring_register_ring_fd() API?

I guess there will be a compiler error because the function prototype is
missing in .

This can be addressed by checking for the presence of the function in
meson.build:

+config_host_data.set('CONFIG_LIBURING_REGISTER_RING_FD', 
cc.has_function('io_uring_register_ring_fd', prefix: '#include '))

Then block/io_uring.c can call the function only when available:

+#ifdef CONFIG_LIBURING_REGISTER_RING_FD
+io_uring_register_ring_fd(&s->ring);
+#endif

(I haven't tested this code but it should be close.)

Stefan
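
[Editor's sketch] The compile-time guard suggested above could look roughly like the following. This is only an illustration under stated assumptions: struct demo_ring, demo_register_ring_fd() and the other demo_* names are hypothetical stand-ins so the fragment builds without liburing, and CONFIG_LIBURING_REGISTER_RING_FD is assumed to be defined by the build system only when the function was detected.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for struct io_uring so this sketch builds
 * without liburing installed; it is not the real type.
 */
struct demo_ring {
    int ring_fd;
    bool fd_registered;
};

#ifdef CONFIG_LIBURING_REGISTER_RING_FD
/* Stand-in for io_uring_register_ring_fd(); the real one lives in liburing. */
static int demo_register_ring_fd(struct demo_ring *ring)
{
    ring->fd_registered = true;
    return 1;
}
#endif

/* Set up the ring, registering the ring fd only when the build found the API. */
static void demo_ring_init(struct demo_ring *ring, int fd)
{
    assert(ring != NULL);
    ring->ring_fd = fd;
    ring->fd_registered = false;
#ifdef CONFIG_LIBURING_REGISTER_RING_FD
    /* Result ignored on purpose: a failure only loses the optimization. */
    demo_register_ring_fd(ring);
#endif
}

/* True when this build was compiled with the optional call enabled. */
static bool demo_built_with_register_ring_fd(void)
{
#ifdef CONFIG_LIBURING_REGISTER_RING_FD
    return true;
#else
    return false;
#endif
}
```

Built without the macro, demo_ring_init() compiles to the old behaviour and never references the missing symbol, which is the property the meson check is meant to guarantee.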




Re: [PATCH v4] Use io_uring_register_ring_fd() to skip fd operations

2022-04-22 Thread Stefan Hajnoczi
On Fri, Apr 22, 2022 at 11:08:39AM +0100, Daniel P. Berrangé wrote:
> On Fri, Apr 22, 2022 at 11:00:47AM +0100, Fam Zheng wrote:
> > On 2022-04-22 09:52, Daniel P. Berrangé wrote:
> > > On Fri, Apr 22, 2022 at 09:34:28AM +0100, Fam Zheng wrote:
> > > > On 2022-04-22 00:36, Sam Li wrote:
> > > > > Linux recently added a new io_uring(7) optimization API that QEMU
> > > > > doesn't take advantage of yet. The liburing library that QEMU uses
> > > > > has added a corresponding new API calling io_uring_register_ring_fd().
> > > > > When this API is called after creating the ring, the io_uring_submit()
> > > > > library function passes a flag to the io_uring_enter(2) syscall
> > > > > allowing it to skip the ring file descriptor fdget()/fdput()
> > > > > operations. This saves some CPU cycles.
> > > > > 
> > > > > Signed-off-by: Sam Li 
> > > > > ---
> > > > >  block/io_uring.c | 10 +-
> > > > >  1 file changed, 9 insertions(+), 1 deletion(-)
> > > > > 
> > > > > diff --git a/block/io_uring.c b/block/io_uring.c
> > > > > index 782afdb433..5247fb79e2 100644
> > > > > --- a/block/io_uring.c
> > > > > +++ b/block/io_uring.c
> > > > > @@ -435,8 +435,16 @@ LuringState *luring_init(Error **errp)
> > > > >  }
> > > > >  
> > > > >  ioq_init(&s->io_q);
> > > > > -return s;
> > > > > +if (io_uring_register_ring_fd(&s->ring) < 0) {
> > > > > +/*
> > > > > + * Only warn about this error: we will fallback to the 
> > > > > non-optimized
> > > > > + * io_uring operations.
> > > > > + */
> > > > > +error_reportf_err(*errp,
> > > > > + "failed to register linux io_uring ring 
> > > > > file descriptor");
> > > > 
> > > > IIUC errp can be NULL, so let's not dereference it without checking. 
> > > > So, just
> > > > use error_report?
> > > 
> > > Plenty of people will be running kernels that lack the new feature,
> > > so this "failure" will be an expected scenario. We shouldn't be
> > > spamming the logs with any error or warning message. Assuming  QEMU
> > > remains fully functional, merely not as optimized, we should be
> > > totally silent.
> > 
> > Functionally, that's a very valid point. But performance wise, is it good to
> > have some visibility of this? Since people use io_uring instead of other
> > options almost certainly for performance, and here the issue does matter 
> > quite
> > a bit.
> 
> IMHO what you describe is largely a documentation issue, and/or something
> for OS vendors to worry about if they want to maximise their users'
> performance. As long as io_uring is fully functional we shouldn't print
> errors on every QEMU startup, as it leads to pointless bug reports/support
> escalations about something that is operating normally, wasting users and
> vendors' time.

Also, this is a minor optimization. It's nice to save a few CPU cycles when
possible, but it's probably not significant enough that users would
bother to upgrade their kernel.

I think no warning is necessary.

Stefan




Re: [PATCH v4] Use io_uring_register_ring_fd() to skip fd operations

2022-04-22 Thread Daniel P . Berrangé
On Fri, Apr 22, 2022 at 11:00:47AM +0100, Fam Zheng wrote:
> On 2022-04-22 09:52, Daniel P. Berrangé wrote:
> > On Fri, Apr 22, 2022 at 09:34:28AM +0100, Fam Zheng wrote:
> > > On 2022-04-22 00:36, Sam Li wrote:
> > > > Linux recently added a new io_uring(7) optimization API that QEMU
> > > > doesn't take advantage of yet. The liburing library that QEMU uses
> > > > has added a corresponding new API calling io_uring_register_ring_fd().
> > > > When this API is called after creating the ring, the io_uring_submit()
> > > > library function passes a flag to the io_uring_enter(2) syscall
> > > > allowing it to skip the ring file descriptor fdget()/fdput()
> > > > operations. This saves some CPU cycles.
> > > > 
> > > > Signed-off-by: Sam Li 
> > > > ---
> > > >  block/io_uring.c | 10 +-
> > > >  1 file changed, 9 insertions(+), 1 deletion(-)
> > > > 
> > > > diff --git a/block/io_uring.c b/block/io_uring.c
> > > > index 782afdb433..5247fb79e2 100644
> > > > --- a/block/io_uring.c
> > > > +++ b/block/io_uring.c
> > > > @@ -435,8 +435,16 @@ LuringState *luring_init(Error **errp)
> > > >  }
> > > >  
> > > >  ioq_init(&s->io_q);
> > > > -return s;
> > > > +if (io_uring_register_ring_fd(&s->ring) < 0) {
> > > > +/*
> > > > + * Only warn about this error: we will fallback to the 
> > > > non-optimized
> > > > + * io_uring operations.
> > > > + */
> > > > +error_reportf_err(*errp,
> > > > + "failed to register linux io_uring ring file 
> > > > descriptor");
> > > 
> > > IIUC errp can be NULL, so let's not dereference it without checking. So, 
> > > just
> > > use error_report?
> > 
> > Plenty of people will be running kernels that lack the new feature,
> > so this "failure" will be an expected scenario. We shouldn't be
> > spamming the logs with any error or warning message. Assuming  QEMU
> > remains fully functional, merely not as optimized, we should be
> > totally silent.
> 
> Functionally, that's a very valid point. But performance wise, is it good to
> have some visibility of this? Since people use io_uring instead of other
> options almost certainly for performance, and here the issue does matter quite
> a bit.

IMHO what you describe is largely a documentation issue, and/or something
for OS vendors to worry about if they want to maximise their users'
performance. As long as io_uring is fully functional we shouldn't print
errors on every QEMU startup, as it leads to pointless bug reports/support
escalations about something that is operating normally, wasting users and
vendors' time.

With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH v4] Use io_uring_register_ring_fd() to skip fd operations

2022-04-22 Thread Fam Zheng
On 2022-04-22 09:52, Daniel P. Berrangé wrote:
> On Fri, Apr 22, 2022 at 09:34:28AM +0100, Fam Zheng wrote:
> > On 2022-04-22 00:36, Sam Li wrote:
> > > Linux recently added a new io_uring(7) optimization API that QEMU
> > > doesn't take advantage of yet. The liburing library that QEMU uses
> > > has added a corresponding new API calling io_uring_register_ring_fd().
> > > When this API is called after creating the ring, the io_uring_submit()
> > > library function passes a flag to the io_uring_enter(2) syscall
> > > allowing it to skip the ring file descriptor fdget()/fdput()
> > > operations. This saves some CPU cycles.
> > > 
> > > Signed-off-by: Sam Li 
> > > ---
> > >  block/io_uring.c | 10 +-
> > >  1 file changed, 9 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/block/io_uring.c b/block/io_uring.c
> > > index 782afdb433..5247fb79e2 100644
> > > --- a/block/io_uring.c
> > > +++ b/block/io_uring.c
> > > @@ -435,8 +435,16 @@ LuringState *luring_init(Error **errp)
> > >  }
> > >  
> > >  ioq_init(&s->io_q);
> > > -return s;
> > > +if (io_uring_register_ring_fd(&s->ring) < 0) {
> > > +/*
> > > + * Only warn about this error: we will fallback to the 
> > > non-optimized
> > > + * io_uring operations.
> > > + */
> > > +error_reportf_err(*errp,
> > > + "failed to register linux io_uring ring file 
> > > descriptor");
> > 
> > IIUC errp can be NULL, so let's not dereference it without checking. So, 
> > just
> > use error_report?
> 
> Plenty of people will be running kernels that lack the new feature,
> so this "failure" will be an expected scenario. We shouldn't be
> spamming the logs with any error or warning message. Assuming  QEMU
> remains fully functional, merely not as optimized, we should be
> totally silent.

Functionally, that's a very valid point. But performance wise, is it good to
have some visibility of this? Since people use io_uring instead of other
options almost certainly for performance, and here the issue does matter quite
a bit.

Fam

> 
> At most stick in a 'trace' point so we can record whether the
> optimization is present.
> 
> With regards,
> Daniel
> -- 
> |: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org -o-https://fstop138.berrange.com :|
> |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|
> 
> 




Re: [PATCH v4] Use io_uring_register_ring_fd() to skip fd operations

2022-04-22 Thread Daniel P . Berrangé
On Fri, Apr 22, 2022 at 09:34:28AM +0100, Fam Zheng wrote:
> On 2022-04-22 00:36, Sam Li wrote:
> > Linux recently added a new io_uring(7) optimization API that QEMU
> > doesn't take advantage of yet. The liburing library that QEMU uses
> > has added a corresponding new API calling io_uring_register_ring_fd().
> > When this API is called after creating the ring, the io_uring_submit()
> > library function passes a flag to the io_uring_enter(2) syscall
> > allowing it to skip the ring file descriptor fdget()/fdput()
> > operations. This saves some CPU cycles.
> > 
> > Signed-off-by: Sam Li 
> > ---
> >  block/io_uring.c | 10 +-
> >  1 file changed, 9 insertions(+), 1 deletion(-)
> > 
> > diff --git a/block/io_uring.c b/block/io_uring.c
> > index 782afdb433..5247fb79e2 100644
> > --- a/block/io_uring.c
> > +++ b/block/io_uring.c
> > @@ -435,8 +435,16 @@ LuringState *luring_init(Error **errp)
> >  }
> >  
> >  ioq_init(&s->io_q);
> > -return s;
> > +if (io_uring_register_ring_fd(&s->ring) < 0) {
> > +/*
> > + * Only warn about this error: we will fallback to the 
> > non-optimized
> > + * io_uring operations.
> > + */
> > +error_reportf_err(*errp,
> > + "failed to register linux io_uring ring file 
> > descriptor");
> 
> IIUC errp can be NULL, so let's not dereference it without checking. So, just
> use error_report?

Plenty of people will be running kernels that lack the new feature,
so this "failure" will be an expected scenario. We shouldn't be
spamming the logs with any error or warning message. Assuming  QEMU
remains fully functional, merely not as optimized, we should be
totally silent.

At most stick in a 'trace' point so we can record whether the
optimization is present.

With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|
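
[Editor's sketch] The "stay silent, at most add a trace point" approach described above might be shaped like this. It is a sketch under assumptions, not QEMU code: demo_trace_register_ring_fd() is a hypothetical hook standing in for whatever tracing facility the project provides, and rc stands in for the return value of the optional registration call.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical trace hook; a real build would use the project's own tracing. */
static int demo_trace_calls;
static int demo_trace_last_rc;

static void demo_trace_register_ring_fd(int rc)
{
    demo_trace_calls++;
    demo_trace_last_rc = rc;
}

/*
 * Handle the result of the optional registration call.
 * Nothing is printed; the outcome is only recorded through the trace hook.
 * Returns true when the optimized path is active.
 */
static bool demo_note_register_result(int rc)
{
    demo_trace_register_ring_fd(rc);
    return rc >= 0;
}
```

A negative rc is treated as an expected outcome on older kernels: the caller carries on with the non-optimized path and the logs stay clean, while the trace record still lets a developer check whether the optimization took effect.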




Re: [PATCH v4] Use io_uring_register_ring_fd() to skip fd operations

2022-04-22 Thread Fam Zheng
On 2022-04-22 00:36, Sam Li wrote:
> Linux recently added a new io_uring(7) optimization API that QEMU
> doesn't take advantage of yet. The liburing library that QEMU uses
> has added a corresponding new API calling io_uring_register_ring_fd().
> When this API is called after creating the ring, the io_uring_submit()
> library function passes a flag to the io_uring_enter(2) syscall
> allowing it to skip the ring file descriptor fdget()/fdput()
> operations. This saves some CPU cycles.
> 
> Signed-off-by: Sam Li 
> ---
>  block/io_uring.c | 10 +-
>  1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/block/io_uring.c b/block/io_uring.c
> index 782afdb433..5247fb79e2 100644
> --- a/block/io_uring.c
> +++ b/block/io_uring.c
> @@ -435,8 +435,16 @@ LuringState *luring_init(Error **errp)
>  }
>  
>  ioq_init(&s->io_q);
> -return s;
> +if (io_uring_register_ring_fd(&s->ring) < 0) {
> +/*
> + * Only warn about this error: we will fallback to the non-optimized
> + * io_uring operations.
> + */
> +error_reportf_err(*errp,
> + "failed to register linux io_uring ring file 
> descriptor");

IIUC errp can be NULL, so let's not dereference it without checking. So, just
use error_report?

Fam

> +}
>  
> +return s;
>  }
>  
>  void luring_cleanup(LuringState *s)
> -- 
> Use error_reportf_err to avoid memory leak due to not freeing error
> object.
> --
> 2.35.1
> 
> 
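
[Editor's sketch] The concern about errp raised above can be illustrated as follows. This is a minimal sketch using hypothetical stand-ins (DemoError, demo_report, demo_init_tail), not QEMU's real Error API: it only shows a non-fatal path that reports a message without ever dereferencing an errp argument that callers are allowed to pass as NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for QEMU's Error type; not the real definition. */
typedef struct DemoError {
    const char *msg;
} DemoError;

/*
 * Report a non-fatal problem into a caller-supplied buffer without
 * touching errp, in the spirit of a plain error_report()-style call.
 */
static int demo_report(char *buf, size_t len, const char *msg)
{
    return snprintf(buf, len, "warning: %s", msg);
}

/*
 * Stand-in for the tail of an init function that takes an errp argument.
 * register_rc mimics the result of the optional registration call.
 * errp may be NULL, so this path never dereferences it.
 */
static int demo_init_tail(int register_rc, DemoError **errp,
                          char *log, size_t log_len)
{
    (void)errp; /* deliberately unused: callers may pass NULL */
    if (register_rc < 0) {
        demo_report(log, log_len, "failed to register ring fd");
    }
    return 0; /* init still succeeds; only the optimization is lost */
}
```

Calling demo_init_tail(-1, NULL, ...) is safe here, whereas an expression such as *errp in the same position would dereference a NULL pointer.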



[PATCH v4] Use io_uring_register_ring_fd() to skip fd operations

2022-04-21 Thread Sam Li
Linux recently added a new io_uring(7) optimization API that QEMU
doesn't take advantage of yet. The liburing library that QEMU uses
has added a corresponding new API calling io_uring_register_ring_fd().
When this API is called after creating the ring, the io_uring_submit()
library function passes a flag to the io_uring_enter(2) syscall
allowing it to skip the ring file descriptor fdget()/fdput()
operations. This saves some CPU cycles.

Signed-off-by: Sam Li 
---
 block/io_uring.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/block/io_uring.c b/block/io_uring.c
index 782afdb433..5247fb79e2 100644
--- a/block/io_uring.c
+++ b/block/io_uring.c
@@ -435,8 +435,16 @@ LuringState *luring_init(Error **errp)
     }
 
     ioq_init(&s->io_q);
-    return s;
+    if (io_uring_register_ring_fd(&s->ring) < 0) {
+        /*
+         * Only warn about this error: we will fallback to the non-optimized
+         * io_uring operations.
+         */
+        error_reportf_err(*errp,
+                          "failed to register linux io_uring ring file descriptor");
+    }
 
+    return s;
 }
 
 void luring_cleanup(LuringState *s)
-- 
Use error_reportf_err to avoid memory leak due to not freeing error
object.
--
2.35.1