Re: [PATCH v4] Use io_uring_register_ring_fd() to skip fd operations
On Fri, 22 Apr 2022 at 16:40, olc wrote:
>
> Hi Stefan,
> I've tested the code and it behaves as you expected. Should I add this to a
> new patch version or leave it as is?

Hi Sam,
Sorry I missed this email. Please send a new version of the patch with
CONFIG_LIBURING_REGISTER_RING_FD.

Thanks,
Stefan
Re: [PATCH v4] Use io_uring_register_ring_fd() to skip fd operations
Hi Stefan,

I've tested the code and it behaves as you expected. Should I add this to a
new patch version or leave it as is?

Sam

On Fri, Apr 22, 2022 at 23:10, Stefan Hajnoczi wrote:

> On Fri, Apr 22, 2022 at 12:36:49AM +0800, Sam Li wrote:
> > Linux recently added a new io_uring(7) optimization API that QEMU
> > doesn't take advantage of yet. The liburing library that QEMU uses
> > has added a corresponding new API calling io_uring_register_ring_fd().
> > When this API is called after creating the ring, the io_uring_submit()
> > library function passes a flag to the io_uring_enter(2) syscall
> > allowing it to skip the ring file descriptor fdget()/fdput()
> > operations. This saves some CPU cycles.
> >
> > Signed-off-by: Sam Li
> > ---
> >  block/io_uring.c | 10 +-
> >  1 file changed, 9 insertions(+), 1 deletion(-)
> >
> > diff --git a/block/io_uring.c b/block/io_uring.c
> > index 782afdb433..5247fb79e2 100644
> > --- a/block/io_uring.c
> > +++ b/block/io_uring.c
> > @@ -435,8 +435,16 @@ LuringState *luring_init(Error **errp)
> >      }
> >
> >      ioq_init(&s->io_q);
> > -    return s;
> > +    if (io_uring_register_ring_fd(&s->ring) < 0) {
>
> What happens when QEMU is built against an older version of liburing
> that lacks the io_uring_register_ring_fd() API?
>
> I guess there will be a compiler error because the function prototype is
> missing in .
>
> This can be addressed by checking for the presence of the function in
> meson.build:
>
> +config_host_data.set('CONFIG_LIBURING_REGISTER_RING_FD',
> cc.has_function('io_uring_register_ring_fd', prefix: '#include
> '))
>
> Then block/io_uring.c can call the function only when available:
>
> +#ifdef CONFIG_LIBURING_REGISTER_RING_FD
> +io_uring_register_ring_fd(&s->ring);
> +#endif
>
> (I haven't tested this code but it should be close.)
>
> Stefan
>
Re: [PATCH v4] Use io_uring_register_ring_fd() to skip fd operations
On Fri, Apr 22, 2022 at 12:36:49AM +0800, Sam Li wrote:
> Linux recently added a new io_uring(7) optimization API that QEMU
> doesn't take advantage of yet. The liburing library that QEMU uses
> has added a corresponding new API calling io_uring_register_ring_fd().
> When this API is called after creating the ring, the io_uring_submit()
> library function passes a flag to the io_uring_enter(2) syscall
> allowing it to skip the ring file descriptor fdget()/fdput()
> operations. This saves some CPU cycles.
>
> Signed-off-by: Sam Li
> ---
>  block/io_uring.c | 10 +-
>  1 file changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/block/io_uring.c b/block/io_uring.c
> index 782afdb433..5247fb79e2 100644
> --- a/block/io_uring.c
> +++ b/block/io_uring.c
> @@ -435,8 +435,16 @@ LuringState *luring_init(Error **errp)
>      }
>
>      ioq_init(&s->io_q);
> -    return s;
> +    if (io_uring_register_ring_fd(&s->ring) < 0) {

What happens when QEMU is built against an older version of liburing
that lacks the io_uring_register_ring_fd() API?

I guess there will be a compiler error because the function prototype is
missing in .

This can be addressed by checking for the presence of the function in
meson.build:

+config_host_data.set('CONFIG_LIBURING_REGISTER_RING_FD', cc.has_function('io_uring_register_ring_fd', prefix: '#include '))

Then block/io_uring.c can call the function only when available:

+#ifdef CONFIG_LIBURING_REGISTER_RING_FD
+io_uring_register_ring_fd(&s->ring);
+#endif

(I haven't tested this code but it should be close.)

Stefan
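The guarded call suggested above can be sketched in a form that compiles without liburing installed. This is only an illustration: `struct io_uring` and the stub `io_uring_register_ring_fd()` below are stand-ins for what liburing itself provides (normally through `<liburing.h>`, which is an assumption here since the header name is missing from the archived message), and `luring_try_register_ring_fd()` is a hypothetical helper name that does not appear in the patch. In a real build the macro would be set by the meson check rather than by hand.

```c
#include <stdbool.h>

/*
 * Stand-in for liburing's ring type so the sketch is self-contained.
 * In QEMU the real definition comes from the liburing headers.
 */
struct io_uring {
    int ring_fd;
};

#ifdef CONFIG_LIBURING_REGISTER_RING_FD
/*
 * Stub standing in for liburing's function: pretend registration
 * succeeded. The real function returns a negative errno on failure.
 */
static int io_uring_register_ring_fd(struct io_uring *ring)
{
    (void)ring;
    return 1;
}
#endif

/*
 * Hypothetical helper: try to register the ring fd when the build-time
 * check found the API, otherwise do nothing. Returns true only if the
 * registration call was made and did not fail.
 */
static bool luring_try_register_ring_fd(struct io_uring *ring)
{
#ifdef CONFIG_LIBURING_REGISTER_RING_FD
    return io_uring_register_ring_fd(ring) >= 0;
#else
    (void)ring; /* older liburing: nothing to call, keep the default path */
    return false;
#endif
}
```

Compiling with `-DCONFIG_LIBURING_REGISTER_RING_FD` exercises the registration branch; without it the helper compiles to a no-op, which is the point of the build-time check: the source builds either way.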
Re: [PATCH v4] Use io_uring_register_ring_fd() to skip fd operations
On Fri, Apr 22, 2022 at 11:08:39AM +0100, Daniel P. Berrangé wrote:
> On Fri, Apr 22, 2022 at 11:00:47AM +0100, Fam Zheng wrote:
> > On 2022-04-22 09:52, Daniel P. Berrangé wrote:
> > > On Fri, Apr 22, 2022 at 09:34:28AM +0100, Fam Zheng wrote:
> > > > On 2022-04-22 00:36, Sam Li wrote:
> > > > > Linux recently added a new io_uring(7) optimization API that QEMU
> > > > > doesn't take advantage of yet. The liburing library that QEMU uses
> > > > > has added a corresponding new API calling io_uring_register_ring_fd().
> > > > > When this API is called after creating the ring, the io_uring_submit()
> > > > > library function passes a flag to the io_uring_enter(2) syscall
> > > > > allowing it to skip the ring file descriptor fdget()/fdput()
> > > > > operations. This saves some CPU cycles.
> > > > >
> > > > > Signed-off-by: Sam Li
> > > > > ---
> > > > >  block/io_uring.c | 10 +-
> > > > >  1 file changed, 9 insertions(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/block/io_uring.c b/block/io_uring.c
> > > > > index 782afdb433..5247fb79e2 100644
> > > > > --- a/block/io_uring.c
> > > > > +++ b/block/io_uring.c
> > > > > @@ -435,8 +435,16 @@ LuringState *luring_init(Error **errp)
> > > > >      }
> > > > >
> > > > >      ioq_init(&s->io_q);
> > > > > -    return s;
> > > > > +    if (io_uring_register_ring_fd(&s->ring) < 0) {
> > > > > +        /*
> > > > > +         * Only warn about this error: we will fallback to the
> > > > > non-optimized
> > > > > +         * io_uring operations.
> > > > > +         */
> > > > > +        error_reportf_err(*errp,
> > > > > +                          "failed to register linux io_uring ring
> > > > > file descriptor");
> > > >
> > > > IIUC errp can be NULL, so let's not dereference it without checking.
> > > > So, just use error_report?
> > >
> > > Plenty of people will be running kernels that lack the new feature,
> > > so this "failure" will be an expected scenario. We shouldn't be
> > > spamming the logs with any error or warning message. Assuming QEMU
> > > remains fully functional, merely not as optimized, we should be
> > > totally silent.
> >
> > Functionally, that's a very valid point. But performance wise, is it good to
> > have some visibility of this? Since people use io_uring instead of other
> > options almost certainly for performance, and here the issue does matter
> > quite a bit.
>
> IMHO what you describe is largely a documentation issue, and/or something
> for OS vendors to worry about if they want to maximise their users'
> performance. As long as io_uring is fully functional we shouldn't print
> errors on every QEMU startup, as it leads to pointless bug reports/support
> escalations about something that is operating normally, wasting users and
> vendors' time.

Also, this is a minor optimization. It's nice to save a few CPU cycles when
possible, but it's probably not significant enough that users would bother
to upgrade their kernel.

I think no warning is necessary.

Stefan
Re: [PATCH v4] Use io_uring_register_ring_fd() to skip fd operations
On Fri, Apr 22, 2022 at 11:00:47AM +0100, Fam Zheng wrote:
> On 2022-04-22 09:52, Daniel P. Berrangé wrote:
> > On Fri, Apr 22, 2022 at 09:34:28AM +0100, Fam Zheng wrote:
> > > On 2022-04-22 00:36, Sam Li wrote:
> > > > Linux recently added a new io_uring(7) optimization API that QEMU
> > > > doesn't take advantage of yet. The liburing library that QEMU uses
> > > > has added a corresponding new API calling io_uring_register_ring_fd().
> > > > When this API is called after creating the ring, the io_uring_submit()
> > > > library function passes a flag to the io_uring_enter(2) syscall
> > > > allowing it to skip the ring file descriptor fdget()/fdput()
> > > > operations. This saves some CPU cycles.
> > > >
> > > > Signed-off-by: Sam Li
> > > > ---
> > > >  block/io_uring.c | 10 +-
> > > >  1 file changed, 9 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/block/io_uring.c b/block/io_uring.c
> > > > index 782afdb433..5247fb79e2 100644
> > > > --- a/block/io_uring.c
> > > > +++ b/block/io_uring.c
> > > > @@ -435,8 +435,16 @@ LuringState *luring_init(Error **errp)
> > > >      }
> > > >
> > > >      ioq_init(&s->io_q);
> > > > -    return s;
> > > > +    if (io_uring_register_ring_fd(&s->ring) < 0) {
> > > > +        /*
> > > > +         * Only warn about this error: we will fallback to the
> > > > non-optimized
> > > > +         * io_uring operations.
> > > > +         */
> > > > +        error_reportf_err(*errp,
> > > > +                          "failed to register linux io_uring ring file
> > > > descriptor");
> > >
> > > IIUC errp can be NULL, so let's not dereference it without checking.
> > > So, just use error_report?
> >
> > Plenty of people will be running kernels that lack the new feature,
> > so this "failure" will be an expected scenario. We shouldn't be
> > spamming the logs with any error or warning message. Assuming QEMU
> > remains fully functional, merely not as optimized, we should be
> > totally silent.
>
> Functionally, that's a very valid point. But performance wise, is it good to
> have some visibility of this? Since people use io_uring instead of other
> options almost certainly for performance, and here the issue does matter quite
> a bit.

IMHO what you describe is largely a documentation issue, and/or something
for OS vendors to worry about if they want to maximise their users'
performance. As long as io_uring is fully functional we shouldn't print
errors on every QEMU startup, as it leads to pointless bug reports/support
escalations about something that is operating normally, wasting users and
vendors' time.

With regards,
Daniel
-- 
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
Re: [PATCH v4] Use io_uring_register_ring_fd() to skip fd operations
On 2022-04-22 09:52, Daniel P. Berrangé wrote:
> On Fri, Apr 22, 2022 at 09:34:28AM +0100, Fam Zheng wrote:
> > On 2022-04-22 00:36, Sam Li wrote:
> > > Linux recently added a new io_uring(7) optimization API that QEMU
> > > doesn't take advantage of yet. The liburing library that QEMU uses
> > > has added a corresponding new API calling io_uring_register_ring_fd().
> > > When this API is called after creating the ring, the io_uring_submit()
> > > library function passes a flag to the io_uring_enter(2) syscall
> > > allowing it to skip the ring file descriptor fdget()/fdput()
> > > operations. This saves some CPU cycles.
> > >
> > > Signed-off-by: Sam Li
> > > ---
> > >  block/io_uring.c | 10 +-
> > >  1 file changed, 9 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/block/io_uring.c b/block/io_uring.c
> > > index 782afdb433..5247fb79e2 100644
> > > --- a/block/io_uring.c
> > > +++ b/block/io_uring.c
> > > @@ -435,8 +435,16 @@ LuringState *luring_init(Error **errp)
> > >      }
> > >
> > >      ioq_init(&s->io_q);
> > > -    return s;
> > > +    if (io_uring_register_ring_fd(&s->ring) < 0) {
> > > +        /*
> > > +         * Only warn about this error: we will fallback to the
> > > non-optimized
> > > +         * io_uring operations.
> > > +         */
> > > +        error_reportf_err(*errp,
> > > +                          "failed to register linux io_uring ring file
> > > descriptor");
> >
> > IIUC errp can be NULL, so let's not dereference it without checking.
> > So, just use error_report?
>
> Plenty of people will be running kernels that lack the new feature,
> so this "failure" will be an expected scenario. We shouldn't be
> spamming the logs with any error or warning message. Assuming QEMU
> remains fully functional, merely not as optimized, we should be
> totally silent.

Functionally, that's a very valid point. But performance wise, is it good to
have some visibility of this? Since people use io_uring instead of other
options almost certainly for performance, and here the issue does matter quite
a bit.

Fam

>
> At most stick in a 'trace' point so we can record whether the
> optimization is present.
>
> With regards,
> Daniel
> -- 
> |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org -o- https://fstop138.berrange.com :|
> |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
>
Re: [PATCH v4] Use io_uring_register_ring_fd() to skip fd operations
On Fri, Apr 22, 2022 at 09:34:28AM +0100, Fam Zheng wrote:
> On 2022-04-22 00:36, Sam Li wrote:
> > Linux recently added a new io_uring(7) optimization API that QEMU
> > doesn't take advantage of yet. The liburing library that QEMU uses
> > has added a corresponding new API calling io_uring_register_ring_fd().
> > When this API is called after creating the ring, the io_uring_submit()
> > library function passes a flag to the io_uring_enter(2) syscall
> > allowing it to skip the ring file descriptor fdget()/fdput()
> > operations. This saves some CPU cycles.
> >
> > Signed-off-by: Sam Li
> > ---
> >  block/io_uring.c | 10 +-
> >  1 file changed, 9 insertions(+), 1 deletion(-)
> >
> > diff --git a/block/io_uring.c b/block/io_uring.c
> > index 782afdb433..5247fb79e2 100644
> > --- a/block/io_uring.c
> > +++ b/block/io_uring.c
> > @@ -435,8 +435,16 @@ LuringState *luring_init(Error **errp)
> >      }
> >
> >      ioq_init(&s->io_q);
> > -    return s;
> > +    if (io_uring_register_ring_fd(&s->ring) < 0) {
> > +        /*
> > +         * Only warn about this error: we will fallback to the
> > non-optimized
> > +         * io_uring operations.
> > +         */
> > +        error_reportf_err(*errp,
> > +                          "failed to register linux io_uring ring file
> > descriptor");
>
> IIUC errp can be NULL, so let's not dereference it without checking.
> So, just use error_report?

Plenty of people will be running kernels that lack the new feature,
so this "failure" will be an expected scenario. We shouldn't be
spamming the logs with any error or warning message. Assuming QEMU
remains fully functional, merely not as optimized, we should be
totally silent.

At most stick in a 'trace' point so we can record whether the
optimization is present.

With regards,
Daniel
-- 
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
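The "stay silent, at most trace it" approach described above might look roughly like the sketch below. Everything in it is illustrative: `struct io_uring` is a stand-in type, `fake_register_ring_fd()` simulates a kernel that rejects the registration (the `-EINVAL` value is an assumption about what such a kernel would return), and `trace_luring_register_ring_fd()` is a hypothetical trace point name. In QEMU a trace function of that kind would be generated from a trace-events file rather than written by hand.

```c
#include <errno.h>

/* Stand-in for liburing's ring type so the sketch is self-contained. */
struct io_uring {
    int ring_fd;
};

/* Stand-in that simulates a kernel without ring fd registration. */
static int fake_register_ring_fd(struct io_uring *ring)
{
    (void)ring;
    return -EINVAL;
}

/* Bookkeeping so the sketch can be checked without a tracing backend. */
static int trace_calls;
static int last_traced_ret;

/* Hypothetical trace point: records the outcome and prints nothing. */
static void trace_luring_register_ring_fd(int ret)
{
    trace_calls++;
    last_traced_ret = ret;
}

/*
 * Attempt the optimization. A failure is expected on older kernels, so
 * it is traced but never reported as an error, and the function always
 * returns 0 because the ring remains fully usable either way.
 */
static int luring_setup_ring_fd(struct io_uring *ring,
                                int (*register_fn)(struct io_uring *))
{
    int ret = register_fn(ring);

    trace_luring_register_ring_fd(ret);
    return 0;
}
```

Passing the registration function as a parameter is only a device to keep the sketch testable; the real code would call liburing directly.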
Re: [PATCH v4] Use io_uring_register_ring_fd() to skip fd operations
On 2022-04-22 00:36, Sam Li wrote:
> Linux recently added a new io_uring(7) optimization API that QEMU
> doesn't take advantage of yet. The liburing library that QEMU uses
> has added a corresponding new API calling io_uring_register_ring_fd().
> When this API is called after creating the ring, the io_uring_submit()
> library function passes a flag to the io_uring_enter(2) syscall
> allowing it to skip the ring file descriptor fdget()/fdput()
> operations. This saves some CPU cycles.
>
> Signed-off-by: Sam Li
> ---
>  block/io_uring.c | 10 +-
>  1 file changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/block/io_uring.c b/block/io_uring.c
> index 782afdb433..5247fb79e2 100644
> --- a/block/io_uring.c
> +++ b/block/io_uring.c
> @@ -435,8 +435,16 @@ LuringState *luring_init(Error **errp)
>      }
>
>      ioq_init(&s->io_q);
> -    return s;
> +    if (io_uring_register_ring_fd(&s->ring) < 0) {
> +        /*
> +         * Only warn about this error: we will fallback to the non-optimized
> +         * io_uring operations.
> +         */
> +        error_reportf_err(*errp,
> +                          "failed to register linux io_uring ring file
> descriptor");

IIUC errp can be NULL, so let's not dereference it without checking. So, just
use error_report?

Fam

> +    }
>
> +    return s;
>  }
>
>  void luring_cleanup(LuringState *s)
> --
> Use error_reportf_err to avoid memory leak due to not freeing error
> object.
> --
> 2.35.1
>
>
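To illustrate the concern about dereferencing `errp`, here is a small sketch of the check that has to happen before `*errp` is touched. The `Error` struct is a stand-in (QEMU's real `Error` is opaque and has its own accessors), and `luring_error_text()` is a hypothetical helper used only to make the two NULL cases visible: the caller may pass no `errp` at all, or may pass one that holds no error.

```c
#include <stddef.h>

/* Stand-in for QEMU's Error object, which is opaque in the real code. */
typedef struct Error {
    const char *msg;
} Error;

/*
 * Hypothetical helper: return the message stored behind errp, or a
 * placeholder when there is nothing to read. Both pointer levels are
 * checked, since either errp or *errp may be NULL.
 */
static const char *luring_error_text(Error **errp)
{
    if (errp == NULL || *errp == NULL) {
        return "unknown error"; /* nothing safe to dereference */
    }
    return (*errp)->msg;
}
```

In the patch under review no error is ever stored in `errp` before the report call, which is why reporting a plain message without going through `errp` was suggested instead.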
[PATCH v4] Use io_uring_register_ring_fd() to skip fd operations
Linux recently added a new io_uring(7) optimization API that QEMU
doesn't take advantage of yet. The liburing library that QEMU uses
has added a corresponding new API calling io_uring_register_ring_fd().
When this API is called after creating the ring, the io_uring_submit()
library function passes a flag to the io_uring_enter(2) syscall
allowing it to skip the ring file descriptor fdget()/fdput()
operations. This saves some CPU cycles.

Signed-off-by: Sam Li
---
 block/io_uring.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/block/io_uring.c b/block/io_uring.c
index 782afdb433..5247fb79e2 100644
--- a/block/io_uring.c
+++ b/block/io_uring.c
@@ -435,8 +435,16 @@ LuringState *luring_init(Error **errp)
     }

     ioq_init(&s->io_q);
-    return s;
+    if (io_uring_register_ring_fd(&s->ring) < 0) {
+        /*
+         * Only warn about this error: we will fallback to the non-optimized
+         * io_uring operations.
+         */
+        error_reportf_err(*errp,
+                          "failed to register linux io_uring ring file descriptor");
+    }

+    return s;
 }

 void luring_cleanup(LuringState *s)
--
Use error_reportf_err to avoid memory leak due to not freeing error
object.
--
2.35.1