RE: [PATCH v3 1/2] hw/nvme: Support for Namespaces Management from guest OS - create-ns

2022-10-28 Thread Michael Kropaczek (CW)



-Original Message-
From: Keith Busch  
Sent: Thursday, October 27, 2022 1:21 PM
To: Jonathan Derrick 
Cc: qemu-de...@nongnu.org; Michael Kropaczek (CW) 
; qemu-block@nongnu.org; Klaus Jensen 
; Kevin Wolf ; Hanna Reitz 

Subject: Re: [PATCH v3 1/2] hw/nvme: Support for Namespaces Management from 
guest OS - create-ns

[You don't often get email from kbu...@kernel.org. Learn why this is important 
at https://aka.ms/LearnAboutSenderIdentification ]

Caution: External Email


On Thu, Oct 27, 2022 at 01:00:45PM -0500, Jonathan Derrick wrote:
> +Parameters:
> +
> +``auto-ns-path=``
> +  If specified indicates a support for dynamic management of nvme 
> +namespaces
> +  by means of nvme create-ns command. This path points
> +  to the storage area for backend images must exist. Additionally it 
> +requires
> +  that parameter `ns-subsys` must be specified whereas parameter 
> +`drive`
> +  must not. The legacy namespace backend is disabled, instead, a pair 
> +of
> +  files 'nvme__ns_.cfg' and 'nvme__ns_.img'
> +  will refer to respective namespaces. The create-ns, attach-ns
> +  and detach-ns commands, issued at the guest side, will make changes 
> +to
> +  those files accordingly.
> +  For each namespace exists an image file in raw format and a config 
> +file
> +  containing namespace parameters and state of the attachment 
> +allowing QEMU
> +  to configure namespaces accordingly during start up. If for 
> +instance an
> +  image file has a size of 0 bytes this will be interpreted as non 
> +existent
> +  namespace. Issuing create-ns command will change the status in the 
> +config
> +  files and and will re-size the image file accordingly so the image 
> +file
> +  will be associated with the respective namespace. The main config 
> +file
> +  nvme__ctrl.cfg keeps the track of allocated capacity to 
> +the
> +  namespaces within the nvme controller.
> +  As it is the case of a typical hard drive, backend images together 
> +with
> +  config files need to be created. For this reason the qemu-img tool 
> +has
> +  been extended by adding createns command.
> +
> +   qemu-img createns {-S  -C }
> + [-N ] {}
> +
> +  Parameters:
> +  -S and -C and  are mandatory, `-S` must match `serial` 
> + parameter  and  must match `auto-ns-path` parameter of "-device 
> nvme,..."
> +  specification.
> +  -N is optional, if specified it will set a limit to the number of 
> + potential  namespaces and will reduce the number of backend images 
> + and config files  accordingly. As a default, a set of images of 0 
> + bytes size and default  config files for 256 namespaces will be created, a 
> total of 513 files.
> +
> +Please note that ``nvme-ns`` device is not required to support of 
> +dynamic namespaces management feature. It is not prohibited to assign 
> +a such device to ``nvme`` device specified to support dynamic 
> +namespace management if one has an use case to do so, however, it 
> +will only coexist and be out of the scope of Namespaces Management. 
> +NsIds will be consistently managed, creation (create-ns) of a 
> +namespace will not allocate the NsId already being taken. If 
> +``nvme-ns`` device conflicts with previously created one by create-ns (the 
> same NsId), it will break QEMU's start up.

By requiring the controller's serial number up-front, does this mean we can't 
share dynamic namespaces with other controllers in the subsystem?
[MK]: At this point namespaces of one controller can be attached to the 
subsystem. 
[MK]: More controllers would require additional parameter specifying the number 
of controllers and expanding the naming pattern of backend files.
> +static inline char *create_fmt_name(const char *fmt, char 
> +*ns_directory, char *serial, uint32_t nsid, Error **errp) {
> +char *file_name = NULL;
> +Error *local_err = NULL;
> +
> +storage_path_check(ns_directory, serial, errp);
> +if (local_err) {

How is 'local_err' ever *not* NULL here? Are you meaning to check "*errp" 
instead?
[MK]: This will be corrected, indeed a dead code.
> +error_propagate(errp, local_err);
> +} else {
> +file_name = g_strdup_printf(fmt, ns_directory, serial, nsid);
> +}
> +
> +return file_name;
> +}
> +
> +static inline char *create_cfg_name(char *ns_directory, char *serial, 
> +uint32_t nsid, Error **errp) {
> +return create_fmt_name(NS_FILE_FMT NS_CFG_EXT, ns_directory, 
> +serial, nsid, errp); }
> +
> +
> +static inline char *create_image_name(char *ns_directory, char 
> +*serial, uint32_t nsid, Error **errp) {
> +return create_fmt_name(NS_FILE_FMT NS_IMG_EXT, ns_directory, 
> +serial, nsid, errp); 

Re: [PATCH v3 1/2] hw/nvme: Support for Namespaces Management from guest OS - create-ns

2022-10-27 Thread Keith Busch
On Thu, Oct 27, 2022 at 01:00:45PM -0500, Jonathan Derrick wrote:
> +Parameters:
> +
> +``auto-ns-path=``
> +  If specified indicates a support for dynamic management of nvme namespaces
> +  by means of nvme create-ns command. This path points
> +  to the storage area for backend images must exist. Additionally it requires
> +  that parameter `ns-subsys` must be specified whereas parameter `drive`
> +  must not. The legacy namespace backend is disabled, instead, a pair of
> +  files 'nvme__ns_.cfg' and 'nvme__ns_.img'
> +  will refer to respective namespaces. The create-ns, attach-ns
> +  and detach-ns commands, issued at the guest side, will make changes to
> +  those files accordingly.
> +  For each namespace exists an image file in raw format and a config file
> +  containing namespace parameters and state of the attachment allowing QEMU
> +  to configure namespaces accordingly during start up. If for instance an
> +  image file has a size of 0 bytes this will be interpreted as non existent
> +  namespace. Issuing create-ns command will change the status in the config
> +  files and and will re-size the image file accordingly so the image file
> +  will be associated with the respective namespace. The main config file
> +  nvme__ctrl.cfg keeps the track of allocated capacity to the
> +  namespaces within the nvme controller.
> +  As it is the case of a typical hard drive, backend images together with
> +  config files need to be created. For this reason the qemu-img tool has
> +  been extended by adding createns command.
> +
> +   qemu-img createns {-S  -C }
> + [-N ] {}
> +
> +  Parameters:
> +  -S and -C and  are mandatory, `-S` must match `serial` parameter
> +  and  must match `auto-ns-path` parameter of "-device nvme,..."
> +  specification.
> +  -N is optional, if specified it will set a limit to the number of potential
> +  namespaces and will reduce the number of backend images and config files
> +  accordingly. As a default, a set of images of 0 bytes size and default
> +  config files for 256 namespaces will be created, a total of 513 files.
> +
> +Please note that ``nvme-ns`` device is not required to support of dynamic
> +namespaces management feature. It is not prohibited to assign a such device 
> to
> +``nvme`` device specified to support dynamic namespace management if one has
> +an use case to do so, however, it will only coexist and be out of the scope 
> of
> +Namespaces Management. NsIds will be consistently managed, creation 
> (create-ns)
> +of a namespace will not allocate the NsId already being taken. If ``nvme-ns``
> +device conflicts with previously created one by create-ns (the same NsId),
> +it will break QEMU's start up.

By requiring the controller's serial number up-front, does this mean we
can't share dynamic namespaces with other controllers in the subsystem?

> +static inline char *create_fmt_name(const char *fmt, char *ns_directory, 
> char *serial, uint32_t nsid, Error **errp)
> +{
> +char *file_name = NULL;
> +Error *local_err = NULL;
> +
> +storage_path_check(ns_directory, serial, errp);
> +if (local_err) {

How is 'local_err' ever *not* NULL here? Are you meaning to check
"*errp" instead?

> +error_propagate(errp, local_err);
> +} else {
> +file_name = g_strdup_printf(fmt, ns_directory, serial, nsid);
> +}
> +
> +return file_name;
> +}
> +
> +static inline char *create_cfg_name(char *ns_directory, char *serial, 
> uint32_t nsid, Error **errp)
> +{
> +return create_fmt_name(NS_FILE_FMT NS_CFG_EXT, ns_directory, serial, 
> nsid, errp);
> +}
> +
> +
> +static inline char *create_image_name(char *ns_directory, char *serial, 
> uint32_t nsid, Error **errp)
> +{
> +return create_fmt_name(NS_FILE_FMT NS_IMG_EXT, ns_directory, serial, 
> nsid, errp);
> +}
> +
> +static inline int nsid_cfg_save(char *ns_directory, char *serial, QDict 
> *ns_cfg, uint32_t nsid)
> +{
> +GString *json = NULL;
> +char *filename;
> +FILE *fp;
> +int ret = 0;
> +Error *local_err = NULL;
> +
> +json = qobject_to_json_pretty(QOBJECT(ns_cfg), false);
> +
> +if (strlen(json->str) + 2 /* '\n'+'\0' */ > NS_CFG_MAXSIZE) {
> +error_setg(&local_err, "ns-cfg allowed max size %d exceeded", 
> NS_CFG_MAXSIZE);

I find this whole "local_err" control flow unpleasant to follow. The
local_error gets set in this above condition only to be overwritten in
the very next step without ever being used? Surely that can't be right.

I'm just picking on this one example here, but the pattern appears to
repeat. I think this would be easier to read if the error conditions
took 'goto' paths to unwind so that the good path doesn't require such
deep 'if/else' nesting.

> +}
> +
> +filename = create_cfg_name(ns_directory, serial, nsid, &local_err);
> +if (!local_err) {
> +fp = fopen(filename, "w");
> +if (fp == NULL) {
> +error_setg(&local_err, "open %s: %s", filename,
> +  

[PATCH v3 1/2] hw/nvme: Support for Namespaces Management from guest OS - create-ns

2022-10-27 Thread Jonathan Derrick
From: Michael Kropaczek 

Added support for NVMEe NameSpaces Mangement allowing the guest OS to
create namespaces by issuing nvme create-ns command.
It is an extension to currently implemented Qemu nvme virtual device.
Virtual devices representing namespaces will be created and/or deleted
during Qemu's running session, at anytime.

Signed-off-by: Michael Kropaczek 
---
 docs/system/devices/nvme.rst |  55 +++-
 hw/nvme/cfg_key_checker.c|  51 
 hw/nvme/ctrl-cfg.c   | 181 +++
 hw/nvme/ctrl.c   | 204 +-
 hw/nvme/meson.build  |   2 +-
 hw/nvme/ns-backend.c | 234 +++
 hw/nvme/ns.c | 234 ++-
 hw/nvme/nvme.h   |  31 -
 hw/nvme/trace-events |   2 +
 include/block/nvme.h |  30 +
 include/hw/nvme/ctrl-cfg.h   |  24 
 include/hw/nvme/ns-cfg.h |  28 +
 include/hw/nvme/nvme-cfg.h   | 201 ++
 qemu-img-cmds.hx |   6 +
 qemu-img.c   | 134 
 15 files changed, 1380 insertions(+), 37 deletions(-)
 create mode 100644 hw/nvme/cfg_key_checker.c
 create mode 100644 hw/nvme/ctrl-cfg.c
 create mode 100644 hw/nvme/ns-backend.c
 create mode 100644 include/hw/nvme/ctrl-cfg.h
 create mode 100644 include/hw/nvme/ns-cfg.h
 create mode 100644 include/hw/nvme/nvme-cfg.h

diff --git a/docs/system/devices/nvme.rst b/docs/system/devices/nvme.rst
index 30f841ef62..13e2fbc0d6 100644
--- a/docs/system/devices/nvme.rst
+++ b/docs/system/devices/nvme.rst
@@ -92,6 +92,59 @@ There are a number of parameters available:
   attach the namespace to a specific ``nvme`` device (identified by an ``id``
   parameter on the controller device).
 
+Additional Namespaces managed by guest OS Namespaces Management
+-
+
+.. code-block:: console
+
+   -device nvme,id=nvme-ctrl,serial=1234,subsys=nvme-subsys,auto-ns-path=path
+
+Parameters:
+
+``auto-ns-path=``
+  If specified indicates a support for dynamic management of nvme namespaces
+  by means of nvme create-ns command. This path points
+  to the storage area for backend images must exist. Additionally it requires
+  that parameter `ns-subsys` must be specified whereas parameter `drive`
+  must not. The legacy namespace backend is disabled, instead, a pair of
+  files 'nvme__ns_.cfg' and 'nvme__ns_.img'
+  will refer to respective namespaces. The create-ns, attach-ns
+  and detach-ns commands, issued at the guest side, will make changes to
+  those files accordingly.
+  For each namespace exists an image file in raw format and a config file
+  containing namespace parameters and state of the attachment allowing QEMU
+  to configure namespaces accordingly during start up. If for instance an
+  image file has a size of 0 bytes this will be interpreted as non existent
+  namespace. Issuing create-ns command will change the status in the config
+  files and and will re-size the image file accordingly so the image file
+  will be associated with the respective namespace. The main config file
+  nvme__ctrl.cfg keeps the track of allocated capacity to the
+  namespaces within the nvme controller.
+  As it is the case of a typical hard drive, backend images together with
+  config files need to be created. For this reason the qemu-img tool has
+  been extended by adding createns command.
+
+   qemu-img createns {-S  -C }
+ [-N ] {}
+
+  Parameters:
+  -S and -C and  are mandatory, `-S` must match `serial` parameter
+  and  must match `auto-ns-path` parameter of "-device nvme,..."
+  specification.
+  -N is optional, if specified it will set a limit to the number of potential
+  namespaces and will reduce the number of backend images and config files
+  accordingly. As a default, a set of images of 0 bytes size and default
+  config files for 256 namespaces will be created, a total of 513 files.
+
+Please note that ``nvme-ns`` device is not required to support of dynamic
+namespaces management feature. It is not prohibited to assign a such device to
+``nvme`` device specified to support dynamic namespace management if one has
+an use case to do so, however, it will only coexist and be out of the scope of
+Namespaces Management. NsIds will be consistently managed, creation (create-ns)
+of a namespace will not allocate the NsId already being taken. If ``nvme-ns``
+device conflicts with previously created one by create-ns (the same NsId),
+it will break QEMU's start up.
+
 NVM Subsystems
 --
 
@@ -320,4 +373,4 @@ controller are:
 
 .. code-block:: console
 
-   echo :01:00.1 > /sys/bus/pci/drivers/nvme/bind
\ No newline at end of file
+   echo :01:00.1 > /sys/bus/pci/drivers/nvme/bind
diff --git a/hw/nvme/cfg_key_checker.c b/hw/nvme/cfg_key_checker.c
new file mode 100644
index 00..5f19126b29
--- /dev/null