Re: Libvirt NVME support

2020-11-09 Thread Peter Krempa
On Mon, Nov 09, 2020 at 16:38:11 +, Suraj Kasi wrote:
> Hi,
> 
> We wanted to check if it’s possible to specify a disk’s target as nvme (so 
> that the disk shows up as an NVMe disk to the guest VM).
> 
> Per libvirt documentation it looks like (since Libvirt 6.0.0) we can specify 
> the disk type as nvme and the disk’s source as nvme. But the documentation does 
> not say anything about being able to specify the disk’s target as nvme. Is it 
> possible to present the disk to the guest as an NVMe disk, and if so, how?
> 

NVMe device emulation is not supported at this point. I'm not even sure
what the state of the feature in qemu upstream is.

If you have a real NVMe device, you can obviously use PCI device
assignment with it to pass it to the guest os.



RE: Libvirt NVME support

2020-11-09 Thread Thanos Makatos
> -----Original Message-----
> From: Peter Krempa
> Sent: 09 November 2020 16:44
> To: Suraj Kasi
> Cc: libvirt-l...@redhat.com; Thanos Makatos; John Levon
> Subject: Re: Libvirt NVME support
> 
> On Mon, Nov 09, 2020 at 16:38:11 +, Suraj Kasi wrote:
> > Hi,
> >
> > We wanted to check if it’s possible to specify a disk’s target as nvme (so
> > that the disk shows up as an NVMe disk to the guest VM).
> >
> > Per libvirt documentation it looks like (since Libvirt 6.0.0) we can
> > specify the disk type as nvme and the disk’s source as nvme. But the
> > documentation does not say anything about being able to specify the disk’s
> > target as nvme. Is it possible to present the disk to the guest as an NVMe
> > disk, and if so, how?
> >
> 
> NVMe device emulation is not supported at this point. I'm not even sure
> what the state of the feature in qemu upstream is.

In older QEMU versions (~2.12) it was broken, not sure whether it's fixed now. 
In any case, we plan to provide NVMe emulation using SPDK once the multiprocess 
QEMU and vfio-user/out-of-process device emulation patch series are merged.

> 
> If you have a real NVMe device, you can obviously use PCI device
> assignment with it to pass it to the guest os.

We want a _virtual_ NVMe controller in the guest where the backend can be 
connected to anything, e.g. iSCSI, raw block, NVMe, etc.




Re: Libvirt NVME support

2020-11-16 Thread Suraj Kasi
Hi Peter,

Just wanted to follow up. As Thanos mentioned, we want a virtual NVMe 
controller in the guest, for which support doesn't yet exist in libvirt. Is 
it something that would be accepted if we were to implement it?

Thanks,
Suraj




Re: Libvirt NVME support

2020-11-18 Thread Peter Krempa
On Mon, Nov 16, 2020 at 23:01:00 +, Suraj Kasi wrote:
> Hi Peter,
> 
> Just wanted to follow up. As Thanos mentioned that we want a virtual NVMe 
> controller in the guest for which the support doesn't yet exist in libvirt. 
> Is it something that would be accepted if we were to implement it?

Sure. Preferably post your proposed design of the XML as a RFC patch on
the list so that the design can be discussed without wasting any
development work first.

As a separate question, is there any performance benefit of emulating a
NVMe controller compared to e.g. virtio-scsi?



RE: Libvirt NVME support

2020-11-18 Thread Thanos Makatos
> As a separate question, is there any performance benefit of emulating a
> NVMe controller compared to e.g. virtio-scsi?

We haven't measured that yet; I would expect it to be slightly faster and/or more
CPU efficient but wouldn't be surprised if it isn't. The main benefit of using
NVMe is that we don't have to install VirtIO drivers in the guest.




Re: Libvirt NVME support

2020-11-18 Thread Peter Krempa
On Wed, Nov 18, 2020 at 09:57:14 +, Thanos Makatos wrote:
> > As a separate question, is there any performance benefit of emulating a
> > NVMe controller compared to e.g. virtio-scsi?
> 
> We haven't measured that yet; I would expect it to be slight faster and/or 
> more
> CPU efficient but wouldn't be surprised if it isn't. The main benefit of using
> NVMe is that we don't have to install VirtIO drivers in the guest.

Okay, I'm not sold on the drivers bit but that is definitely not a
problem with regard to adding support for emulating NVMe controllers to
libvirt.

As a starting point a trivial way to model this in the XML will be:



And then add the storage into it as:


  
  
  



  
  
  


The 'drive' address here maps the disk to the controller. This example
uses unit= as the way to specify the namespace ID. Both 'bus' and 'target'
must be 0.

You can theoretically also add your own address type if 'drive' doesn't
look right.

This model will have problems with hotplug/unplug if the NVMe spec
doesn't actually allow hotplug of a single namespace into a controller,
as libvirt's hotplug APIs only deal with one element at a time.

We theoretically could work around this by allowing hotplug of disks
which correspond to the namespace used while the controller was not
attached yet, and the attach of the controller then attaches both the
backends and the controller. This is a bit hacky though.

Another obvious solution is to disallow hotplug of the namespaces and
thus also the controller.
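
The layout described in this message (one NVMe controller, with each disk
pointing at it through a 'drive' address whose unit= carries the namespace ID)
can be sketched roughly as below. This is only an illustration under stated
assumptions: the XML is a guess pieced together from the prose, bus='nvme' on
the target is not an existing libvirt value, and the file paths, the controller
index and the helper name namespaces_by_controller are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML: values are illustrative guesses based on the prose
# (controller index, 'drive' address, unit= used as the namespace ID).
DEVICES_XML = """
<devices>
  <controller type='nvme' index='0'/>
  <disk type='file' device='disk'>
    <source file='/var/lib/libvirt/images/ns-a.img'/>
    <target dev='sda' bus='nvme'/>
    <address type='drive' controller='0' bus='0' target='0' unit='0'/>
  </disk>
  <disk type='file' device='disk'>
    <source file='/var/lib/libvirt/images/ns-b.img'/>
    <target dev='sdb' bus='nvme'/>
    <address type='drive' controller='0' bus='0' target='0' unit='1'/>
  </disk>
</devices>
"""


def namespaces_by_controller(xml_text):
    """Map controller index -> sorted list of unit values (namespace IDs)."""
    root = ET.fromstring(xml_text)
    mapping = {}
    for disk in root.findall("disk"):
        addr = disk.find("address")
        # The proposal requires both 'bus' and 'target' to be 0.
        if addr.get("bus") != "0" or addr.get("target") != "0":
            raise ValueError("'bus' and 'target' must be 0")
        controller = int(addr.get("controller"))
        mapping.setdefault(controller, []).append(int(addr.get("unit")))
    return {ctrl: sorted(units) for ctrl, units in mapping.items()}


print(namespaces_by_controller(DEVICES_XML))
```

With this shape, several disks sharing controller='0' become several
namespaces of one emulated controller, which is the property the later
discussion about addresses relies on.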



Re: Libvirt NVME support

2020-11-18 Thread Michal Privoznik

On 11/18/20 11:24 AM, Peter Krempa wrote:
> On Wed, Nov 18, 2020 at 09:57:14 +, Thanos Makatos wrote:

[...]

> Another obvious solution is to disallow hotplug of the namespaces and
> thus also the controller.


Would it make sense to relax the current limitation in libvirt and allow 
<disk type='nvme'/> (which is meant for cases where the backend is a 
NVMe disk) to be on something other than the 'virtio' bus?


Michal



Re: Libvirt NVME support

2020-11-18 Thread Peter Krempa
On Wed, Nov 18, 2020 at 20:31:03 +0100, Michal Privoznik wrote:
> On 11/18/20 11:24 AM, Peter Krempa wrote:
> > On Wed, Nov 18, 2020 at 09:57:14 +, Thanos Makatos wrote:

[...]

> Would it make sense to relax the current limitation in libvirt and allow
> <disk type='nvme'/> (which is meant for cases where the backend is a NVMe
> disk) to be on something else than 'virtio' bus?

This is really orthogonal to the emulated NVMe controller.

A <disk type='nvme'> can theoretically back any disk frontend. I don't
remember now why we actually mandate it only for virtio. Do you?

Apart from that, it doesn't make that much sense to use 

RE: Libvirt NVME support

2020-11-19 Thread Thanos Makatos
> As a starting point a trivial way to model this in the XML will be:
> 
> 
> 
> And then add the storage into it as:
> 
> 
>   
>   

'target dev' is how the device appears in the guest, right? It should be
something like 'nvme0n1'. I'm  not sure though this is something that we can
put here anyway, I think the guest driver can number controllers arbitrarily.
Maybe we should specify something like BDF? Or is this something QEMU will
have to figure out how to do?

>   
> 
> 
> 
>   
>   
>   
> 
> 
> The 'drive' address here maps the disk to the controller. This example
> uses unit= as the way to specify the namespace ID. Both 'bus' and 'target'
> must be 0.
> 
> You can theoretically also add your own address type if 'drive' doesn't
> look right.
> 
> This model will have problems with hotplug/unplug if the NVMe spec
> doesn't actually allow hotplug of a single namespace into a controller
> as libvirt's hotplug APIs only deal with one element at time.

The NVMe spec does allow hotplug/unplug of namespaces, so libvirt should be
fine supporting this?




Re: Libvirt NVME support

2020-11-19 Thread Peter Krempa
On Thu, Nov 19, 2020 at 10:17:56 +, Thanos Makatos wrote:
> > As a starting point a trivial way to model this in the XML will be:
> > 
> > 
> > 
> > And then add the storage into it as:
> > 
> > 
> >   
> >   
> 
> 'target dev' is how the device appears in the guest, right? It should be
> something like 'nvme0n1'. I'm  not sure though this is something that we can
> put here anyway, I think the guest driver can number controllers arbitrarily.

Well, it was supposed to be like that but really is not. Even with other
buses the kernel can name the device arbitrarily, so it doesn't really
matter.

> Maybe we should specify something like BDF? Or is this something QEMU will
> have to figure out how to do?
> 
> >   
> > 
> > 
> > 
> >   
> >   
> >   
> > 
> > 
> > The 'drive' address here maps the disk to the controller. This example
> > uses unit= as the way to specify the namespace ID. Both 'bus' and 'target'
> > must be 0.
> > 
> > You can theoretically also add your own address type if 'drive' doesn't
> > look right.
> > 
> > This model will have problems with hotplug/unplug if the NVMe spec
> > doesn't actually allow hotplug of a single namespace into a controller
> > as libvirt's hotplug APIs only deal with one element at time.
> 
> The NVMe spec does allow hotplug/unplug of namespaces, so libvirt should be
> fine supporting this?

Ah, cool in such case there shouldn't be any problem. You can attach a
controller and then attach namespaces to it or the other way around.

Problem would be if the namespace would need to be attached
simultaneously with the controller.



RE: Libvirt NVME support

2020-11-23 Thread Thanos Makatos
> On Thu, Nov 19, 2020 at 10:17:56 +, Thanos Makatos wrote:
> > > As a starting point a trivial way to model this in the XML will be:
> > >
> > > 
> > >
> > > And then add the storage into it as:
> > >
> > > 
> > >   
> > >   
> >
> > 'target dev' is how the device appears in the guest, right? It should be
> > something like 'nvme0n1'. I'm  not sure though this is something that we
> can
> > put here anyway, I think the guest driver can number controllers 
> > arbitrarily.
> 
> Well, it was supposed to be like that but really is not. Even with other
> buses the kernel can name the device arbitrarily, so it doesn't really
> matter.
> 
> > Maybe we should specify something like BDF? Or is this something QEMU
> will
> > have to figure out how to do?
> >
> > >   
> > > 
> > >
> > > 
> > >   
> > >   
> > >   
> > > 

Revisiting your initial suggestion, it should be something like this
(s/sdb/nvme0):


  
  
  


> > >
> > > The 'drive' address here maps the disk to the controller. This example

IIUC we need a way to associate storage (this XML snippet) with the controller
you defined earlier (the <controller type='nvme'> element). So
shouldn't we only require associating this piece of storage with the controller
based on the index?

> > > uses unit= as the way to specify the namespace ID. Both 'bus' and 'target'
> > > must be 0.

I think 'namespace' or 'ns' would be more suitable instead of 'unit'.
What are 'bus' and 'target' here? And why do they have to be 0?
Do we really need dev='nvme0' in <target>? Specifying the controller index
should be enough, no?

Wouldn't this contain the minimum amount of information to unambiguously map
this piece of storage to the controller?


  
  
  





Re: Libvirt NVME support

2020-11-23 Thread Peter Krempa
On Mon, Nov 23, 2020 at 09:47:23 +, Thanos Makatos wrote:
> > On Thu, Nov 19, 2020 at 10:17:56 +, Thanos Makatos wrote:
> > > > As a starting point a trivial way to model this in the XML will be:
> > > >
> > > > 
> > > >
> > > > And then add the storage into it as:
> > > >
> > > > 
> > > >   
> > > >   
> > >
> > > 'target dev' is how the device appears in the guest, right? It should be
> > > something like 'nvme0n1'. I'm  not sure though this is something that we
> > can
> > > put here anyway, I think the guest driver can number controllers 
> > > arbitrarily.
> > 
> > Well, it was supposed to be like that but really is not. Even with other
> > buses the kernel can name the device arbitrarily, so it doesn't really
> > matter.
> > 
> > > Maybe we should specify something like BDF? Or is this something QEMU
> > will
> > > have to figure out how to do?
> > >
> > > >   
> > > > 
> > > >
> > > > 
> > > >   
> > > >   
> > > >   
> > > > 
> 
> Revistiting your initial suggestion, it should be something like this
> (s/sdb/nvme0):
> 
> 
>   
>   
>   
> 

Note that the parser for 'dev' is a bit quirky, old, and used in many
places besides the qemu driver. It's also used with numbers in non-qemu
cases. Extending that to parse numbers for nvme but not for sda might
become ugly very quickly. Sticking with a letter at the end ('nvmea')
might be a more straightforward approach.
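
The letter-suffix naming suggested here (as in 'sda', 'sdb', ...) can be
sketched as below. This is an illustrative helper only, not libvirt's actual
parser; the function name index_to_target_name and the use of 'nvme' as the
prefix are assumptions made for the example.

```python
def index_to_target_name(prefix, index):
    """Build a target name with a letter suffix: 0 -> 'a', 25 -> 'z', 26 -> 'aa'."""
    if index < 0:
        raise ValueError("index must be non-negative")
    suffix = ""
    n = index
    while True:
        # Take the lowest "digit" as a letter, then move to the next position.
        suffix = chr(ord("a") + n % 26) + suffix
        n = n // 26 - 1
        if n < 0:
            break
    return prefix + suffix


for i in (0, 1, 25, 26):
    print(i, index_to_target_name("nvme", i))
```

Each disk still gets a unique target, and no digits have to be parsed out of
the 'dev' string.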

>   
> > > >
> > > > The 'drive' address here maps the disk to the controller. This example
> 
> IIUC we need a way to associate storage (this XML snippet) with the controller
> you defined earlier (). So
> shouldn't we only require associating this piece of storage with the 
> controller
> based on the index?

No. The common approach is to do it via what's specified as 

> 
> > > > uses unit= as the way to specify the namespace ID. Both 'bus' and 
> > > > 'target'
> > > > must be 0.
> 
> I think 'namespace' or 'ns' would be more suitable instead of 'unit'.
> What are 'bus' and 'target' here? And why do they have to be 0?
> Do we really need dev='nvme0' in ? Specifying the controller index
> should be enough, no?

You certainly can add 

> Wouldn't this contain the minimum amount of information to unambiguously map
> this piece of storage to the controller?
> 
> 
>   
>   
>   
> 

That certainly is correct if you include the "type='nvme'" attribute.



RE: Libvirt NVME support

2020-11-23 Thread Thanos Makatos


> On Mon, Nov 23, 2020 at 09:47:23 +, Thanos Makatos wrote:
> > > On Thu, Nov 19, 2020 at 10:17:56 +, Thanos Makatos wrote:
> > > > > As a starting point a trivial way to model this in the XML will be:
> > > > >
> > > > > 
> > > > >
> > > > > And then add the storage into it as:
> > > > >
> > > > > 
> > > > >   
> > > > >   
> > > >
> > > > 'target dev' is how the device appears in the guest, right? It should be
> > > > something like 'nvme0n1'. I'm  not sure though this is something that
> we
> > > can
> > > > put here anyway, I think the guest driver can number controllers
> arbitrarily.
> > >
> > > Well, it was supposed to be like that but really is not. Even with other
> > > buses the kernel can name the device arbitrarily, so it doesn't really
> > > matter.
> > >
> > > > Maybe we should specify something like BDF? Or is this something
> QEMU
> > > will
> > > > have to figure out how to do?
> > > >
> > > > >> > > > unit='0'/>
> > > > > 
> > > > >
> > > > > 
> > > > >   
> > > > >   
> > > > >> > > > unit='1'/>
> > > > > 
> >
> > Revistiting your initial suggestion, it should be something like this
> > (s/sdb/nvme0):
> >
> > 
> >   
> >   
> >   
> > 
> 
> Note that the parser for 'dev' is a bit quirky, old, and used in many
> places besides the qemu driver. It's also used with numbers in non-qemu
> cases. Extending that to parse numbers for nvme but not for sda might
> become ugly very quickly. Sticking with a letter at the end ('nvmea'
> might be a more straightforward approach.

Then I think we should just stick with 'nvme'.

> 
> >
> > > > >
> > > > > The 'drive' address here maps the disk to the controller. This example
> >
> > IIUC we need a way to associate storage (this XML snippet) with the
> controller
> > you defined earlier (). So
> > shouldn't we only require associating this piece of storage with the
> controller
> > based on the index?
> 
> No. The common approach is to do it via what's specified as 
> 
> >
> > > > > uses unit= as the way to specify the namespace ID. Both 'bus' and
> 'target'
> > > > > must be 0.
> >
> > I think 'namespace' or 'ns' would be more suitable instead of 'unit'.
> > What are 'bus' and 'target' here? And why do they have to be 0?
> > Do we really need dev='nvme0' in ? Specifying the controller
> index
> > should be enough, no?
> 
> You certainly can add 
> 
> > Wouldn't this contain the minimum amount of information to
> unambiguously map
> > this piece of storage to the controller?
> >
> > 
> >   
> >   
> >   
> > 
> 
> That certainly is correct if you include the "type='nvme'" attribute.

Great, so the following would be a good place for us to start?




  
  
  



  
  
  





Re: Libvirt NVME support

2020-11-23 Thread Peter Krempa
On Mon, Nov 23, 2020 at 13:07:51 +, Thanos Makatos wrote:
> 
> > On Mon, Nov 23, 2020 at 09:47:23 +, Thanos Makatos wrote:
> > > > On Thu, Nov 19, 2020 at 10:17:56 +, Thanos Makatos wrote:
> > > > > > As a starting point a trivial way to model this in the XML will be:
> > > > > >
> > > > > > 
> > > > > >
> > > > > > And then add the storage into it as:
> > > > > >
> > > > > > 
> > > > > >   
> > > > > >   
> > > > >
> > > > > 'target dev' is how the device appears in the guest, right? It should 
> > > > > be
> > > > > something like 'nvme0n1'. I'm  not sure though this is something that
> > we
> > > > can
> > > > > put here anyway, I think the guest driver can number controllers
> > arbitrarily.
> > > >
> > > > Well, it was supposed to be like that but really is not. Even with other
> > > > buses the kernel can name the device arbitrarily, so it doesn't really
> > > > matter.
> > > >
> > > > > Maybe we should specify something like BDF? Or is this something
> > QEMU
> > > > will
> > > > > have to figure out how to do?
> > > > >
> > > > > >> > > > > unit='0'/>
> > > > > > 
> > > > > >
> > > > > > 
> > > > > >   
> > > > > >   
> > > > > >> > > > > unit='1'/>
> > > > > > 
> > >
> > > Revistiting your initial suggestion, it should be something like this
> > > (s/sdb/nvme0):
> > >
> > > 
> > >   
> > >   
> > >   
> > > 
> > 
> > Note that the parser for 'dev' is a bit quirky, old, and used in many
> > places besides the qemu driver. It's also used with numbers in non-qemu
> > cases. Extending that to parse numbers for nvme but not for sda might
> > become ugly very quickly. Sticking with a letter at the end ('nvmea'
> > might be a more straightforward approach.
> 
> Then I think we should just stick with 'nvme'.

You still need a way to "index" it somehow. The target must be unique
for each disk.

[...]


> > That certainly is correct if you include the "type='nvme'" attribute.
> 
> Great, so the following would be a good place for us to start?
> 
> 
> 
> 
>   
>   
>   
> 
> 
> 
>   
>   
>   
> 

The address is reasonable this way.



Re: Libvirt NVME support

2020-11-23 Thread Daniel P . Berrangé
On Wed, Nov 18, 2020 at 11:24:30AM +0100, Peter Krempa wrote:
> On Wed, Nov 18, 2020 at 09:57:14 +, Thanos Makatos wrote:
> > > As a separate question, is there any performance benefit of emulating a
> > > NVMe controller compared to e.g. virtio-scsi?
> > 
> > We haven't measured that yet; I would expect it to be slight faster and/or 
> > more
> > CPU efficient but wouldn't be surprised if it isn't. The main benefit of 
> > using
> > NVMe is that we don't have to install VirtIO drivers in the guest.
> 
> Okay, I'm not sold on the drivers bit but that is definitely not a
> problem in regards of adding support for emulating NVME controllers to
> libvirt.
> 
> As a starting point a trivial way to model this in the XML will be:
> 
> 
> 
> And then add the storage into it as:
> 
> 
>   
>   
>   
> 
> 
> 
>   
>   
>   
> 
> 
> The 'drive' address here maps the disk to the controller. This example
> uses unit= as the way to specify the namespace ID. Both 'bus' and 'target'
> must be 0.

FWIW, I think that our overloading of type=drive for FDC, IDE, and SCSI
was a mistake in retrospect. We should have had type=fdc, type=ide, type=scsi,
since each uses a different subset of the attributes.

Let's not continue this mistake with NVMe - create a type=nvme address
type.
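
A dedicated address type along these lines could carry only the attributes
that mean something for NVMe. The sketch below is hypothetical: libvirt had no
type='nvme' address at the time of this thread, and the attribute name
'namespace' (one of the names floated earlier in the thread) and the helper
parse_nvme_address are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical attribute set for a dedicated NVMe address type.
NVME_ADDRESS_ATTRS = {"type", "controller", "namespace"}


def parse_nvme_address(xml_text):
    """Return (controller, namespace) from an <address type='nvme' .../> element."""
    addr = ET.fromstring(xml_text)
    if addr.tag != "address" or addr.get("type") != "nvme":
        raise ValueError("expected <address type='nvme'>")
    unexpected = set(addr.attrib) - NVME_ADDRESS_ATTRS
    if unexpected:
        # 'bus', 'target' and 'unit' belong to type='drive' and are rejected here.
        raise ValueError(f"unexpected attributes: {sorted(unexpected)}")
    missing = NVME_ADDRESS_ATTRS - set(addr.attrib)
    if missing:
        raise ValueError(f"missing attributes: {sorted(missing)}")
    return int(addr.get("controller")), int(addr.get("namespace"))


print(parse_nvme_address("<address type='nvme' controller='0' namespace='1'/>"))
```

Compared with reusing type='drive', nothing has to be documented as "must be
0", because the unused attributes simply do not exist for this type.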

I also wonder whether device='disk' makes sense too, as opposed to using
device='nvme', since this is really not very similar to classic disks.


Regards,
Daniel
-- 
|: https://berrange.com       -o-   https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org        -o-             https://fstop138.berrange.com :|
|: https://entangle-photo.org -o-       https://www.instagram.com/dberrange :|



Re: Libvirt NVME support

2020-11-23 Thread Michal Prívozník

On 11/23/20 3:03 PM, Daniel P. Berrangé wrote:
> On Wed, Nov 18, 2020 at 11:24:30AM +0100, Peter Krempa wrote:

[...]

> FWIW, I think that our overloading of type=drive for FDC, IDE, and SCSI
> was a mistake in retrospect. We should have had type=fdc, type=ide, type=scsi,
> since each uses a different subset of the attributes.
>
> Lets not continue this mistake with NVME - create a type=nvme address
> type.

Don't NVMes live on a PCI(e) bus? Can't we just treat NVMes as PCI 
devices? Or are we targeting SATA too? Because we also have that type of 
address.


Michal



Re: Libvirt NVME support

2020-11-23 Thread Daniel P . Berrangé
On Mon, Nov 23, 2020 at 03:32:20PM +0100, Michal Prívozník wrote:
> On 11/23/20 3:03 PM, Daniel P. Berrangé wrote:
> > On Wed, Nov 18, 2020 at 11:24:30AM +0100, Peter Krempa wrote:
> > > On Wed, Nov 18, 2020 at 09:57:14 +, Thanos Makatos wrote:
> > > > > As a separate question, is there any performance benefit of emulating 
> > > > > a
> > > > > NVMe controller compared to e.g. virtio-scsi?
> > > > 
> > > > We haven't measured that yet; I would expect it to be slight faster 
> > > > and/or more
> > > > CPU efficient but wouldn't be surprised if it isn't. The main benefit 
> > > > of using
> > > > NVMe is that we don't have to install VirtIO drivers in the guest.
> > > 
> > > Okay, I'm not sold on the drivers bit but that is definitely not a
> > > problem in regards of adding support for emulating NVME controllers to
> > > libvirt.
> > > 
> > > As a starting point a trivial way to model this in the XML will be:
> > > 
> > >  
> > > 
> > > And then add the storage into it as:
> > > 
> > >  
> > >
> > >
> > >
> > >  
> > > 
> > >  
> > >
> > >
> > >
> > >  
> > > 
> > > The 'drive' address here maps the disk to the controller. This example
> > > uses unit= as the way to specify the namespace ID. Both 'bus' and 'target'
> > > must be 0.
> > 
> > FWIW, I think that our overloeading of type=drive for FDC, IDE, and SCSI
> > was a mistake in retrospect. We should have had type=fdc, type=ide, 
> > type=scsi,
> > since each uses a different subset of the attributes.
> > 
> > Lets not continue this mistake with NVME - create a type=nvme address
> > type.
> 
> Don't NVMes live on a PCI(e) bus? Can't we just threat NVMes as PCI devices?
> Or are we targeting sata too? Bcause we also have that type of address.

IIUC, the NVMe *controller* lives on a PCI bus, and it can have any number
of namespaces associated with it. In real hardware the namespaces can be
dynamically changed on the fly. So these <disk> elements are the namespaces,
not the controller, hence PCI isn't relevant AFAICT except for the
<controller> device.


Regards,
Daniel



Re: Libvirt NVME support

2020-11-23 Thread Peter Krempa
On Mon, Nov 23, 2020 at 15:32:20 +0100, Michal Privoznik wrote:
> On 11/23/20 3:03 PM, Daniel P. Berrangé wrote:
> > On Wed, Nov 18, 2020 at 11:24:30AM +0100, Peter Krempa wrote:
> > > On Wed, Nov 18, 2020 at 09:57:14 +, Thanos Makatos wrote:
> > > > > As a separate question, is there any performance benefit of emulating 
> > > > > a
> > > > > NVMe controller compared to e.g. virtio-scsi?
> > > > 
> > > > We haven't measured that yet; I would expect it to be slight faster 
> > > > and/or more
> > > > CPU efficient but wouldn't be surprised if it isn't. The main benefit 
> > > > of using
> > > > NVMe is that we don't have to install VirtIO drivers in the guest.
> > > 
> > > Okay, I'm not sold on the drivers bit but that is definitely not a
> > > problem in regards of adding support for emulating NVME controllers to
> > > libvirt.
> > > 
> > > As a starting point a trivial way to model this in the XML will be:
> > > 
> > >  
> > > 
> > > And then add the storage into it as:
> > > 
> > >  
> > >
> > >
> > >
> > >  
> > > 
> > >  
> > >
> > >
> > >
> > >  
> > > 
> > > The 'drive' address here maps the disk to the controller. This example
> > > uses unit= as the way to specify the namespace ID. Both 'bus' and 'target'
> > > must be 0.
> > 
> > FWIW, I think that our overloeading of type=drive for FDC, IDE, and SCSI
> > was a mistake in retrospect. We should have had type=fdc, type=ide, 
> > type=scsi,
> > since each uses a different subset of the attributes.
> > 
> > Lets not continue this mistake with NVME - create a type=nvme address
> > type.
> 
> Don't NVMes live on a PCI(e) bus? Can't we just threat NVMes as PCI devices?
> Or are we targeting sata too? Bcause we also have that type of address.

No, the NVMe controller lives on PCIe. Here we are trying to emulate a
NVMe controller (as <controller type='nvme'>, if you look elsewhere in the
other subthread). The <disk> element here maps to individual emulated
namespaces for the emulated NVMe controller.

If we'd try to map one <disk> per PCIe device, you'd prevent us from
emulating multiple namespaces.



Re: Libvirt NVME support

2020-11-23 Thread Daniel P . Berrangé
On Mon, Nov 23, 2020 at 03:36:42PM +0100, Peter Krempa wrote:
> On Mon, Nov 23, 2020 at 15:32:20 +0100, Michal Privoznik wrote:
> > On 11/23/20 3:03 PM, Daniel P. Berrangé wrote:
> > > On Wed, Nov 18, 2020 at 11:24:30AM +0100, Peter Krempa wrote:
> > > > On Wed, Nov 18, 2020 at 09:57:14 +, Thanos Makatos wrote:
> > > > > > As a separate question, is there any performance benefit of 
> > > > > > emulating a
> > > > > > NVMe controller compared to e.g. virtio-scsi?
> > > > > 
> > > > > We haven't measured that yet; I would expect it to be slight faster 
> > > > > and/or more
> > > > > CPU efficient but wouldn't be surprised if it isn't. The main benefit 
> > > > > of using
> > > > > NVMe is that we don't have to install VirtIO drivers in the guest.
> > > > 
> > > > Okay, I'm not sold on the drivers bit but that is definitely not a
> > > > problem in regards of adding support for emulating NVME controllers to
> > > > libvirt.
> > > > 
> > > > As a starting point a trivial way to model this in the XML will be:
> > > > 
> > > >  
> > > > 
> > > > And then add the storage into it as:
> > > > 
> > > >  
> > > >
> > > >
> > > > > > > unit='0'/>
> > > >  
> > > > 
> > > >  
> > > >
> > > >
> > > > > > > unit='1'/>
> > > >  
> > > > 
> > > > The 'drive' address here maps the disk to the controller. This example
> > > > uses unit= as the way to specify the namespace ID. Both 'bus' and 
> > > > 'target'
> > > > must be 0.
> > > 
> > > FWIW, I think that our overloeading of type=drive for FDC, IDE, and SCSI
> > > was a mistake in retrospect. We should have had type=fdc, type=ide, 
> > > type=scsi,
> > > since each uses a different subset of the attributes.
> > > 
> > > Lets not continue this mistake with NVME - create a type=nvme address
> > > type.
> > 
> > Don't NVMes live on a PCI(e) bus? Can't we just threat NVMes as PCI devices?
> > Or are we targeting sata too? Bcause we also have that type of address.
> 
> No, the NVMe controller lives on PCIe. Here we are trying to emulate a
> NVMe controller (as  if you look elsewhere in the other
> subthread. The  element here maps to individual emulated
> namespaces for the emulated NVMe controller.
> 
> If we'd try to map one  per PCIe device, you'd prevent us from
> emulating multiple namespaces.

The odd thing here is that we're trying to expose a different host backing
store for each namespace, hence the need to expose multiple <disk>.

Does it even make sense if you expose a namespace "2" without first
exposing a namespace "1"?

It makes me a little uneasy, as it feels like trying to export a
regular disk, where we have a different host backing store for each
partition. The difference I guess is that partition tables are a purely
software construct, whereas namespaces are a hardware construct.
Exposing individual partitions of a disk was done in Xen, but most
people think it was kind of a mistake, as you could get a partition
without any containing disk. At least in this case we do have a
NVMe controller present so the namespace isn't orphaned, like the
old Xen partitions.



The alternative is to say only one host backing store, and then either
let the guest dynamically carve it up into namespaces, or have some
data format in the host backing store to represent the namespaces, or
have an XML element to specify the regions of host backing that
correspond to namespaces, eg

  
 
 
 



 
 
  

This is of course less flexible, and I'm not entirely serious about
suggesting this, but it's an option that exists nonetheless.
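
The "single backing store split into regions" alternative can be illustrated
with a small calculation. This is a hypothetical sketch, not a proposed libvirt
format: the helper name carve_namespaces, the contiguous layout and the
numbering of namespace IDs from 1 are all assumptions.

```python
def carve_namespaces(total_bytes, sizes):
    """Assign contiguous (offset, size) regions of one backing store to NSIDs 1..n."""
    regions = {}
    offset = 0
    for nsid, size in enumerate(sizes, start=1):
        if size <= 0:
            raise ValueError("namespace size must be positive")
        if offset + size > total_bytes:
            raise ValueError("namespaces exceed the backing store")
        regions[nsid] = (offset, size)
        offset += size
    return regions


GiB = 1024 ** 3
print(carve_namespaces(10 * GiB, [4 * GiB, 6 * GiB]))
```

The loss of flexibility is visible here: every namespace has to fit inside the
one store, so a namespace cannot be backed by a different kind of storage.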

Regards,
Daniel



Re: Libvirt NVME support

2020-11-23 Thread Peter Krempa
On Mon, Nov 23, 2020 at 15:01:31 +, Daniel Berrange wrote:
> On Mon, Nov 23, 2020 at 03:36:42PM +0100, Peter Krempa wrote:
> > On Mon, Nov 23, 2020 at 15:32:20 +0100, Michal Privoznik wrote:

[...]

> > No, the NVMe controller lives on PCIe. Here we are trying to emulate a
> > NVMe controller (as  if you look elsewhere in the other
> > subthread. The  element here maps to individual emulated
> > namespaces for the emulated NVMe controller.
> > 
> > If we'd try to map one  per PCIe device, you'd prevent us from
> > emulating multiple namespaces.
> 
> The odd thing here is that we're trying expose different host backing
> store for each namespace, hence the need to expose multiple .
> 
> Does it even make sense if you expose a namespace "2" without first
> exposing a namespace "1" ?

[1]

> 
> It makes me a little uneasy, as it feels like  trying to export an
> regular disk, where we have a different host backing store for each
> partition. The difference I guess is that partition tables are a purely
> software construct, where as namespaces are a hardware construct.

For this purpose I viewed the namespace to be akin to a LUN on a
SCSI bus. For now controllers usually have just one namespace
and the storage is directly connected to it.

In the other subthread I've specifically asked whether the nvme standard
has a notion of namespace hotplug. Since it does, it seems to be very
similar to how we deal with SCSI disks.

Ad [1]. That can be a limitation here. I wonder actually if you can have
0 namespaces. If that's possible then the model still holds. Obviously,
if we can't have 0 namespaces, hotplug would be impossible.

> Exposing individual partitions to a disk was done in Xen, but most
> people think it was kind of a mistake, as you could get a partition
> without any containing disk. At least in this case we do have a
> NVME controller present so the namespace isn't orphaned, like the
> old Xen partitons.

Well, the difference is that the nvme device node in linux actually
consists of 3 separate parts:

/dev/nvme0n1p1:

  /dev/nvme0 - controller
  n1         - namespace
  p1         - partition

In this case we end up at the namespace component, so we don't really
deal in any way with partitions. It's actually more similar to SCSI,
albeit the SCSI naming in linux does in no way include the controller,
which actually creates a mess.
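
To make the three-part naming concrete, here is a minimal Python sketch
that splits a node path into controller, namespace and partition. The
names parse_nvme_node and NvmeNode are made up for illustration, and the
regex covers only the simple nvmeXnYpZ form shown above, not every name
the kernel can produce.

```python
import re
from typing import NamedTuple, Optional


class NvmeNode(NamedTuple):
    """Hypothetical container for the parts of an NVMe device node name."""
    controller: int
    namespace: Optional[int]
    partition: Optional[int]


# Simple form only: /dev/nvme<ctrl>[n<ns>[p<part>]]
_NVME_RE = re.compile(r"^/dev/nvme(\d+)(?:n(\d+)(?:p(\d+))?)?$")


def parse_nvme_node(path: str) -> NvmeNode:
    """Split a device node path into controller, namespace and partition."""
    match = _NVME_RE.match(path)
    if match is None:
        raise ValueError(f"not a simple NVMe device node: {path!r}")
    ctrl, ns, part = match.groups()
    return NvmeNode(
        controller=int(ctrl),
        namespace=int(ns) if ns is not None else None,
        partition=int(part) if part is not None else None,
    )
```

With this, a <disk> in the model discussed here would correspond to the
namespace level, i.e. a node where partition is None.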

> The alternative is to say only one host backing store, and then either
> let the guest dynamically carve it up into namespaces, or have some
> data format in the host backing store to represent the namespaces, or
> have an XML element to specify the regions of host backing that
> correspond to namespaces, eg
> 
>   
>  
>  
>  
> 
> 
> 
>  
>  
>   
> 
> this is of course less flexible, and I'm not entirely serious about
> suggesting this, but its an option that exists none the less.

Eww. This is disgusting and borderline useless if you ever want to
modify the backing image, but it certainly can be achieved with multiple
'raw' format drivers.
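
For illustration only, a rough Python sketch of what "multiple 'raw'
format drivers" over one backing store could look like. It assumes the
qemu raw driver accepts 'offset' and 'size' options (worth checking
against the QMP reference for your qemu version). The function name
carve_namespaces and the node names are hypothetical, and nothing here is
what libvirt actually generates.

```python
from typing import Dict, List


def carve_namespaces(backing_node: str, sizes: List[int]) -> List[Dict[str, object]]:
    """Build one 'raw' layer per namespace over a single backing node.

    Each layer covers a contiguous, non-overlapping byte range. The
    'offset' and 'size' keys are assumed to match the raw driver options.
    """
    layers: List[Dict[str, object]] = []
    offset = 0
    for nsid, size in enumerate(sizes, start=1):
        if size <= 0:
            raise ValueError("namespace size must be positive")
        layers.append({
            "driver": "raw",
            "node-name": f"ns{nsid}",  # hypothetical naming scheme
            "file": backing_node,
            "offset": offset,
            "size": size,
        })
        offset += size
    return layers
```

The fixed offsets also show why this is painful in practice: growing one
namespace means shifting every region that follows it.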

I don't think the NVMe standard mandates that the memory backing the
namespace must be the same for all namespaces.

For a less disgusting and more usable setup, the namespace element can
be a collection of  elements.

The above also will require use of virDomainUpdateDevice if you'd want
to change the backing store in any way since that's possible.



RE: Libvirt NVME support

2020-11-23 Thread Thanos Makatos
> On Mon, Nov 23, 2020 at 13:07:51 +, Thanos Makatos wrote:
> >
> > > On Mon, Nov 23, 2020 at 09:47:23 +, Thanos Makatos wrote:
> > > > > On Thu, Nov 19, 2020 at 10:17:56 +, Thanos Makatos wrote:
> > > > > > > As a starting point a trivial way to model this in the XML will 
> > > > > > > be:
> > > > > > >
> > > > > > > 
> > > > > > >
> > > > > > > And then add the storage into it as:
> > > > > > >
> > > > > > > 
> > > > > > >   
> > > > > > >   
> > > > > >
> > > > > > 'target dev' is how the device appears in the guest, right? It 
> > > > > > should
> be
> > > > > > something like 'nvme0n1'. I'm  not sure though this is something
> that
> > > we
> > > > > can
> > > > > > put here anyway, I think the guest driver can number controllers
> > > arbitrarily.
> > > > >
> > > > > Well, it was supposed to be like that but really is not. Even with 
> > > > > other
> > > > > buses the kernel can name the device arbitrarily, so it doesn't really
> > > > > matter.
> > > > >
> > > > > > Maybe we should specify something like BDF? Or is this something
> > > QEMU
> > > > > will
> > > > > > have to figure out how to do?
> > > > > >
> > > > > > >> > > > > > unit='0'/>
> > > > > > > 
> > > > > > >
> > > > > > > 
> > > > > > >   
> > > > > > >   
> > > > > > >> > > > > > unit='1'/>
> > > > > > > 
> > > >
> > > > Revistiting your initial suggestion, it should be something like this
> > > > (s/sdb/nvme0):
> > > >
> > > > 
> > > >   
> > > >   
> > > >   
> > > > 
> > >
> > > Note that the parser for 'dev' is a bit quirky, old, and used in many
> > > places besides the qemu driver. It's also used with numbers in non-qemu
> > > cases. Extending that to parse numbers for nvme but not for sda might
> > > become ugly very quickly. Sticking with a letter at the end ('nvmea'
> > > might be a more straightforward approach.
> >
> > Then I think we should just stick with 'nvme'.
> 
> You still need a way to "index" it somehow. The target must be unique
> for each disk.

I think I've misunderstood something, I thought controller='1' in 
refers to index='1' in . So  should be:



What's controller='1' then?




Re: Libvirt NVME support

2020-11-23 Thread Peter Krempa
On Mon, Nov 23, 2020 at 16:48:55 +, Thanos Makatos wrote:
> > On Mon, Nov 23, 2020 at 13:07:51 +, Thanos Makatos wrote:
> > >
> > > > On Mon, Nov 23, 2020 at 09:47:23 +, Thanos Makatos wrote:
> > > > > > On Thu, Nov 19, 2020 at 10:17:56 +, Thanos Makatos wrote:

> > > > >
> > > > > Revistiting your initial suggestion, it should be something like this
> > > > > (s/sdb/nvme0):
> > > > >
> > > > > 
> > > > >   
> > > > >   
> > > > >> > > > unit='1'/>
> > > > > 
> > > >
> > > > Note that the parser for 'dev' is a bit quirky, old, and used in many
> > > > places besides the qemu driver. It's also used with numbers in non-qemu
> > > > cases. Extending that to parse numbers for nvme but not for sda might
> > > > become ugly very quickly. Sticking with a letter at the end ('nvmea'
> > > > might be a more straightforward approach.
> > >
> > > Then I think we should just stick with 'nvme'.
> > 
> > You still need a way to "index" it somehow. The target must be unique
> > for each disk.
> 
> I think I've misunderstood something, I thought controller='1' in  ...>
> refers to index='1' in . So  should be:
> 
> 
> 
> What's controller='1' then?

What I meant by the above is that the value of "<target dev=" must be
unique for every <disk>. I also wanted to advise against using numbers
for making it unique. Numbers used for it have a legacy meaning.
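
As a sketch of the letter-suffix idea, the following Python function
maps a zero-based disk index to a unique target name using letters only
(a, b, ..., z, aa, ab, ...), in the same spirit as the existing sda/sdb
naming. The function name and the 'nvme' default prefix are assumptions
for illustration, not an agreed naming scheme.

```python
def index_to_target(idx: int, prefix: str = "nvme") -> str:
    """Map a zero-based index to prefix + letters: a..z, aa, ab, ..."""
    if idx < 0:
        raise ValueError("index must be non-negative")
    suffix = ""
    n = idx
    while True:
        # Take the least significant "letter digit" first.
        suffix = chr(ord("a") + n % 26) + suffix
        # Bijective base-26: there is no zero digit, hence the -1.
        n = n // 26 - 1
        if n < 0:
            break
    return prefix + suffix
```

Since the mapping is one-to-one, distinct indexes always give distinct
dev= values, and no digits ever appear in the suffix.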

Your suggested 

RE: Libvirt NVME support

2020-11-23 Thread Thanos Makatos



> -Original Message-
> From: Peter Krempa 
> Sent: 23 November 2020 15:20
> To: Daniel P. Berrangé 
> Cc: Michal Prívozník ; Thanos Makatos
> ; Suraj Kasi ;
> libvirt-l...@redhat.com; John Levon 
> Subject: Re: Libvirt NVME support
> 
> On Mon, Nov 23, 2020 at 15:01:31 +, Daniel Berrange wrote:
> > On Mon, Nov 23, 2020 at 03:36:42PM +0100, Peter Krempa wrote:
> > > On Mon, Nov 23, 2020 at 15:32:20 +0100, Michal Privoznik wrote:
> 
> [...]
> 
> > > No, the NVMe controller lives on PCIe. Here we are trying to emulate a
> > > NVMe controller (as  if you look elsewhere in the other
> > > subthread. The  element here maps to individual emulated
> > > namespaces for the emulated NVMe controller.
> > >
> > > If we'd try to map one  per PCIe device, you'd prevent us from
> > > emulating multiple namespaces.
> >
> > The odd thing here is that we're trying expose different host backing
> > store for each namespace, hence the need to expose multiple .
> >
> > Does it even make sense if you expose a namespace "2" without first
> > exposing a namespace "1" ?
> 
> [1]
> 
> >
> > It makes me a little uneasy, as it feels like  trying to export an
> > regular disk, where we have a different host backing store for each
> > partition. The difference I guess is that partition tables are a purely
> > software construct, where as namespaces are a hardware construct.
> 
> For this purpose I viewed the namespace to be akin to a LUN on a
> SCSI bus. For now controllers usually usually have just one namespace
> and the storage is directly connected to it.
> 
> In the other subthread I've specifically asked whether the nvme standard
> has a notion of namespace hotplug. Since it does it seems to be very
> similar to how we deal with SCSI disks.
> 
> Ad [1]. That can be a limitation here. I wonder actually if you can have
> 0 namespaces. If that's possible then the model still holds. Obviously
> if we can't have 0 namespaces hotplug would be impossible.

It is possible to have a controller with no namespaces at all, or to have gaps
in the namespace IDs; there's no requirement to start from 1. Controllers start
from 1 since that's the sensible thing to do. We can end up in situations with
random namespace IDs simply by adding and deleting namespaces.
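
A toy Python model of the above, just to show how gaps appear. The class
EmulatedController and its methods are hypothetical and unrelated to any
qemu or SPDK code. The only rule it enforces is that a namespace ID is
at least 1 and not already in use.

```python
from typing import Dict, List


class EmulatedController:
    """Toy model: a controller holding zero or more namespaces by ID."""

    def __init__(self) -> None:
        # Maps namespace ID -> backing store description (hypothetical).
        self.namespaces: Dict[int, str] = {}

    def attach(self, nsid: int, backing: str) -> None:
        if nsid < 1:
            raise ValueError("namespace IDs start at 1")
        if nsid in self.namespaces:
            raise ValueError(f"namespace {nsid} already attached")
        self.namespaces[nsid] = backing

    def detach(self, nsid: int) -> None:
        # Raises KeyError if the namespace is not attached.
        del self.namespaces[nsid]

    def active_nsids(self) -> List[int]:
        return sorted(self.namespaces)
```

Starting empty, attaching 1, 2 and 3 and then detaching 1 and 2 leaves
a controller whose only namespace is 3, which is the "namespace 2
without namespace 1" style of situation asked about at [1].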

> 
> > Exposing individual partitions to a disk was done in Xen, but most
> > people think it was kind of a mistake, as you could get a partition
> > without any containing disk. At least in this case we do have a
> > NVME controller present so the namespace isn't orphaned, like the
> > old Xen partitons.
> 
> Well, the difference is that the nvme device node in linux actually
> consists of 3 separate parts:
> 
> /dev/nvme0n1p1:
> 
> /dev/nvme0
> - controller
> 
>   n1
> 
> - namespace
> 
> p1
> 
> - partition
> 
> In this case we end up at the namespace component, so we don't really
> deal in any way with partition. It's actually more similar to SCSI
> albeit the SCSI naming in linux does in no way include the controller
> which actually creates a mess.

Agreed, the partition exists solely within the host, so this isn't a problem.
Also, I think the analogy of SCSI controller == NVMe controller and
SCSI LUN == NVMe namespace is pretty accurate for all practical purposes.

> 
> > The alternative is to say only one host backing store, and then either
> > let the guest dynamically carve it up into namespaces, or have some
> > data format in the host backing store to represent the namespaces, or
> > have an XML element to specify the regions of host backing that
> > correspond to namespaces, eg
> >
> >   
> >  
> >  
> >  
> > 
> > 
> > 
> >  
> >  
> >   
> >
> > this is of course less flexible, and I'm not entirely serious about
> > suggesting this, but its an option that exists none the less.
> 
> Eww. This is disgusting and borderline useless if you ever want to
> modify the backing image, but it certainly can be achieved with multiple
> 'raw' format drivers.

I agree that this is too limiting.

> 
> I don't think the NVMe standard mandates that the memory backing the
> namespace must be the same for all namespaces.

The NVMe spec says:

"A namespace is a quantity of non-volatile memory that may be formatted into
logical blocks." (v1.4)

So we can pretty much do whatever we want. Having a single NVMe controller
through which we can pass all disks to a VM can be useful because it simplifies
management and reduces resource consumption both in the guest and the host. But
we can definitely add as many controllers as we want should we need to.

> 
> For a less disgusting and more usable setup, the namespace element can
> be a collection of  elements.
> 
> The above also will require use of virDomainUpdateDevice if you'd want
> to change the backing store in any way since that's possible.




RE: Libvirt NVME support

2020-11-23 Thread Thanos Makatos



> -Original Message-
> From: Peter Krempa 
> Sent: 23 November 2020 16:56
> To: Thanos Makatos 
> Cc: Suraj Kasi ; libvirt-l...@redhat.com; John Levon
> 
> Subject: Re: Libvirt NVME support
> 
> On Mon, Nov 23, 2020 at 16:48:55 +, Thanos Makatos wrote:
> > > On Mon, Nov 23, 2020 at 13:07:51 +, Thanos Makatos wrote:
> > > >
> > > > > On Mon, Nov 23, 2020 at 09:47:23 +, Thanos Makatos wrote:
> > > > > > > On Thu, Nov 19, 2020 at 10:17:56 +, Thanos Makatos wrote:
> 
> > > > > >
> > > > > > Revistiting your initial suggestion, it should be something like 
> > > > > > this
> > > > > > (s/sdb/nvme0):
> > > > > >
> > > > > > 
> > > > > >   
> > > > > >   
> > > > > >> > > > > unit='1'/>
> > > > > > 
> > > > >
> > > > > Note that the parser for 'dev' is a bit quirky, old, and used in many
> > > > > places besides the qemu driver. It's also used with numbers in non-
> qemu
> > > > > cases. Extending that to parse numbers for nvme but not for sda
> might
> > > > > become ugly very quickly. Sticking with a letter at the end ('nvmea'
> > > > > might be a more straightforward approach.
> > > >
> > > > Then I think we should just stick with 'nvme'.
> > >
> > > You still need a way to "index" it somehow. The target must be unique
> > > for each disk.
> >
> > I think I've misunderstood something, I thought controller='1' in  ...>
> > refers to index='1' in . So  should be:
> >
> > 
> >
> > What's controller='1' then?
> 
> What I meant by the above is that the value of " be unique for every . I also wanted to advice to not use numbers
> for making it unique. Numbers used for it have a legacy meaning.
> 
> Your suggested 

and






Re: Libvirt NVME support

2020-11-23 Thread Peter Krempa
On Mon, Nov 23, 2020 at 17:40:58 +, Thanos Makatos wrote:
> 
> 
> > -Original Message-
> > From: Peter Krempa 
> > Sent: 23 November 2020 16:56
> > To: Thanos Makatos 
> > Cc: Suraj Kasi ; libvirt-l...@redhat.com; John Levon
> > 
> > Subject: Re: Libvirt NVME support
> > 
> > On Mon, Nov 23, 2020 at 16:48:55 +, Thanos Makatos wrote:
> > > > On Mon, Nov 23, 2020 at 13:07:51 +, Thanos Makatos wrote:
> > > > >
> > > > > > On Mon, Nov 23, 2020 at 09:47:23 +, Thanos Makatos wrote:
> > > > > > > > On Thu, Nov 19, 2020 at 10:17:56 +, Thanos Makatos wrote:
> > 
> > > > > > >
> > > > > > > Revistiting your initial suggestion, it should be something like 
> > > > > > > this
> > > > > > > (s/sdb/nvme0):
> > > > > > >
> > > > > > > 
> > > > > > >   
> > > > > > >   
> > > > > > >> > > > > > unit='1'/>
> > > > > > > 
> > > > > >
> > > > > > Note that the parser for 'dev' is a bit quirky, old, and used in 
> > > > > > many
> > > > > > places besides the qemu driver. It's also used with numbers in non-
> > qemu
> > > > > > cases. Extending that to parse numbers for nvme but not for sda
> > might
> > > > > > become ugly very quickly. Sticking with a letter at the end ('nvmea'
> > > > > > might be a more straightforward approach.
> > > > >
> > > > > Then I think we should just stick with 'nvme'.
> > > >
> > > > You still need a way to "index" it somehow. The target must be unique
> > > > for each disk.
> > >
> > > I think I've misunderstood something, I thought controller='1' in  > ...>
> > > refers to index='1' in . So  should be:
> > >
> > > 
> > >
> > > What's controller='1' then?
> > 
> > What I meant by the above is that the value of " > be unique for every . I also wanted to advice to not use numbers
> > for making it unique. Numbers used for it have a legacy meaning.
> > 
> > Your suggested  
> OK, so we definitely need dev='...' in target which is something like 
> [a-zA-Z]+
> and is unique. If this identifier is not controlled by the user, I think it
> would be best not to prefix it with 'nvme' (thus resulting in strings like
> 'nvmea' or 'nvmeabc'), as they can be rather confusing for people who don't
> know the details.
> 
> However, I still don't understand how index='1' and controller='1' in address
> relate to index='1' in controller:
> 
> 

index should not be here at all ..


> 
> and
> 
> 
> 
> 

... then it makes sense.



RE: Libvirt NVME support

2020-11-23 Thread Thanos Makatos



> -Original Message-
> From: Peter Krempa 
> Sent: 23 November 2020 17:47
> To: Thanos Makatos 
> Cc: Suraj Kasi ; libvirt-l...@redhat.com; John Levon
> 
> Subject: Re: Libvirt NVME support
> 
> On Mon, Nov 23, 2020 at 17:40:58 +, Thanos Makatos wrote:
> >
> >
> > > -Original Message-
> > > From: Peter Krempa 
> > > Sent: 23 November 2020 16:56
> > > To: Thanos Makatos 
> > > Cc: Suraj Kasi ; libvirt-l...@redhat.com; John
> Levon
> > > 
> > > Subject: Re: Libvirt NVME support
> > >
> > > On Mon, Nov 23, 2020 at 16:48:55 +, Thanos Makatos wrote:
> > > > > On Mon, Nov 23, 2020 at 13:07:51 +, Thanos Makatos wrote:
> > > > > >
> > > > > > > On Mon, Nov 23, 2020 at 09:47:23 +, Thanos Makatos wrote:
> > > > > > > > > On Thu, Nov 19, 2020 at 10:17:56 +, Thanos Makatos
> wrote:
> > >
> > > > > > > >
> > > > > > > > Revistiting your initial suggestion, it should be something 
> > > > > > > > like this
> > > > > > > > (s/sdb/nvme0):
> > > > > > > >
> > > > > > > > 
> > > > > > > >   
> > > > > > > >   
> > > > > > > >unit='1'/>
> > > > > > > > 
> > > > > > >
> > > > > > > Note that the parser for 'dev' is a bit quirky, old, and used in 
> > > > > > > many
> > > > > > > places besides the qemu driver. It's also used with numbers in
> non-
> > > qemu
> > > > > > > cases. Extending that to parse numbers for nvme but not for sda
> > > might
> > > > > > > become ugly very quickly. Sticking with a letter at the end
> ('nvmea'
> > > > > > > might be a more straightforward approach.
> > > > > >
> > > > > > Then I think we should just stick with 'nvme'.
> > > > >
> > > > > You still need a way to "index" it somehow. The target must be
> unique
> > > > > for each disk.
> > > >
> > > > I think I've misunderstood something, I thought controller='1' in
>  > > ...>
> > > > refers to index='1' in . So  should be:
> > > >
> > > > 
> > > >
> > > > What's controller='1' then?
> > >
> > > What I meant by the above is that the value of " > > be unique for every . I also wanted to advice to not use numbers
> > > for making it unique. Numbers used for it have a legacy meaning.
> > >
> > > Your suggested  >
> > OK, so we definitely need dev='...' in target which is something like [a-zA-
> Z]+
> > and is unique. If this identifier is not controlled by the user, I think it
> > would be best not to prefix it with 'nvme' (thus resulting in strings like
> > 'nvmea' or 'nvmeabc'), as they can be rather confusing for people who
> don't
> > know the details.
> >
> > However, I still don't understand how index='1' and controller='1' in
> address
> > relate to index='1' in controller:
> >
> > 
> 
> index should not be here at all ..
> 
> 
> >
> > and
> >
> > 
> >
> >
> 
> ... then it makes sense.

Thanks, it makes perfect sense now.