On 24/02, Nir Soffer wrote:
> On Wed, Feb 23, 2022 at 6:24 PM Muli Ben-Yehuda <m...@lightbitslabs.com> 
> wrote:
> >
> > Thanks for the detailed instructions, Nir. I'm going to scrounge up some 
> > hardware.
> > By the way, if anyone else would like to work on NVMe/TCP support, for 
> > NVMe/TCP target you can either use Lightbits (talk to me offline for 
> > details) or use the upstream Linux NVMe/TCP target. Lightbits is a 
> > clustered storage system while upstream is a single target, but the client 
> > side should be close enough for vdsm/ovirt purposes.
>
> I played with NVMe/TCP a little bit, using qemu to create a virtual
> NVMe disk, and export
> it using the kernel on one VM, and consume it on another VM.
> https://futurewei-cloud.github.io/ARM-Datacenter/qemu/nvme-of-tcp-vms/
>

Hi,

You can also use nvmetcli to create nvme-of devices using the kernel's
nvmet.
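For a quick lab target without any extra tooling, the same configuration
nvmetcli drives can be done directly through configfs. A minimal sketch,
run as root; the NQN, backing device and address below are placeholder
examples, not values from this thread:

```shell
# Load the target core and the TCP transport.
modprobe nvmet nvmet-tcp

CFG=/sys/kernel/config/nvmet
NQN=nqn.2022-02.org.example:testsubsys

# Create a subsystem that accepts any host (fine for a lab, not production).
mkdir $CFG/subsystems/$NQN
echo 1 > $CFG/subsystems/$NQN/attr_allow_any_host

# Back namespace 1 with an existing block device.
mkdir $CFG/subsystems/$NQN/namespaces/1
echo -n /dev/loop0 > $CFG/subsystems/$NQN/namespaces/1/device_path
echo 1 > $CFG/subsystems/$NQN/namespaces/1/enable

# Expose the subsystem on a TCP port.
mkdir $CFG/ports/1
echo tcp        > $CFG/ports/1/addr_trtype
echo ipv4       > $CFG/ports/1/addr_adrfam
echo 192.0.2.10 > $CFG/ports/1/addr_traddr
echo 4420       > $CFG/ports/1/addr_trsvcid
ln -s $CFG/subsystems/$NQN $CFG/ports/1/subsystems/$NQN
```

nvmetcli just gives you an interactive/savable front end over exactly
this configfs layout.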

I haven't tested any cinder NVMe driver with cinderlib yet, but I'll
test it with the LVM driver and nvmet target, since I'm currently
working on improvements/fixes on both the nvmet target and the os-brick
connector.

I have played with both iSCSI and RDMA (using Soft-RoCE) as transport
protocols for NVMe-oF and they worked fine in OpenStack.
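On the client side, connecting a host to an NVMe/TCP target with
nvme-cli looks roughly like this (the address, port and NQN are
placeholders; a /dev/nvmeXnY device appears once the connect succeeds):

```shell
# Load the initiator-side TCP transport.
modprobe nvme-tcp

# Ask the target which subsystems it exports.
nvme discover -t tcp -a 192.0.2.10 -s 4420

# Connect to one subsystem.
nvme connect -t tcp -a 192.0.2.10 -s 4420 -n nqn.2022-02.org.example:testsubsys

# The new namespace devices show up here.
nvme list
```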

Something important to consider when thinking about making it enterprise
ready is that the NVMe-oF connector in os-brick doesn't currently
support any kind of multipathing: native (ANA) or using device mapper.
But it's something we'll be working on.

I'll let you know how the cinderlib testing goes, though I already know
that LVM with the nvmet target has problems during disconnection [1].

[1]: https://bugs.launchpad.net/os-brick/+bug/1961102

> One question about device naming - do we always get the same name of the
> device in all hosts?

Definitely not. Depending on the transport protocol used and the
features enabled (such as multipathing), os-brick will return a
different path to the device.

In the case of NVMe-oF it will return devices like /dev/nvme0n1, which
means controller 0 and namespace 1 as enumerated on the host system.

Note that namespace 1 on the host can actually have a different
namespace ID on the target (for example 10). Example from a test system
using LVM and an nvmet target variant I'm working on:

  $ sudo nvme list
  Node          SN                Model  Namespace  Usage              Format       FW Rev
  ------------- ----------------- ------ ---------- ------------------ ------------ --------
  /dev/nvme0n1  9a9bd17b53e6725f  Linux  11         1.07 GB / 1.07 GB  512 B + 0 B  4.18.0-2
  /dev/nvme0n2  9a9bd17b53e6725f  Linux  10         1.07 GB / 1.07 GB  512 B + 0 B  4.18.0-2
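The mapping between the host's device name and the target-side NSID can
be read from sysfs. A small illustrative helper, not os-brick code; the
sysfs root is a parameter only so the function can be exercised without
real hardware:

```python
import re
from pathlib import Path

def nvme_nsid(dev_name, sysfs="/sys/class/nvme"):
    """Return the target-side namespace ID for a device like 'nvme0n2'.

    The 'n2' suffix is only the host's enumeration order; the real NSID
    lives in sysfs and may differ (nvme0n2 -> NSID 10 in the listing
    above).
    """
    m = re.fullmatch(r"(nvme\d+)n\d+", dev_name)
    if m is None:
        raise ValueError("not an NVMe namespace device: %s" % dev_name)
    controller = m.group(1)
    # e.g. /sys/class/nvme/nvme0/nvme0n2/nsid contains "10\n"
    return int(Path(sysfs, controller, dev_name, "nsid").read_text())
```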


>
> To support VM migration, every device must have a unique name in the
> cluster. With multipath we always have a unique name, since we disable
> "friendly names", so we always have:
>
>     /dev/mapper/{wwid}
>
> With rbd we also do not use /dev/rbdN but a unique path:
>
>     /dev/rbd/poolname/volume-vol-id
>
> How do we ensure a cluster-unique device path? If os_brick does not
> handle it, we can do it in oVirt, for example:
>
>     /run/vdsm/managedvolumes/{uuid} -> /dev/nvme7n42
>

os-brick will not handle this, but assuming udev rules work
consistently on both migration hosts (source and destination), there
will be a symlink in /dev/disk/by-id formed from the NVMe UUID of the
volume.

In the example above we have:

  $ ls -l /dev/disk/by-id/nvme*
  lrwxrwxrwx. 1 root root 13 Feb 24 16:30 /dev/disk/by-id/nvme-Linux_9a9bd17b53e6725f -> ../../nvme0n2
  lrwxrwxrwx. 1 root root 13 Feb 24 16:30 /dev/disk/by-id/nvme-uuid.5310ef24-8301-4e38-a8b8-b61cd61d8b36 -> ../../nvme0n2
  lrwxrwxrwx. 1 root root 13 Feb 24 16:30 /dev/disk/by-id/nvme-uuid.e31b8c9c-b943-430e-afa4-55a110341dcb -> ../../nvme0n1

The UUID may not be the Cinder volume UUID (that depends on the Cinder
driver), but we can find the UUID for a specific NVMe device easily
enough:

  $ cat /sys/class/nvme/nvme0/nvme0n2/wwid
  uuid.5310ef24-8301-4e38-a8b8-b61cd61d8b36
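Going the other way, finding the current kernel device for a known
namespace UUID is just resolving the by-id symlink. An illustrative
helper, not os-brick code; the by-id directory is parameterized only to
make it testable:

```python
import os

def device_for_uuid(vol_uuid, by_id="/dev/disk/by-id"):
    """Resolve /dev/disk/by-id/nvme-uuid.<uuid> to the kernel device.

    Returns the resolved path (e.g. /dev/nvme0n2), or None if udev has
    not created the link (yet).
    """
    link = os.path.join(by_id, "nvme-uuid." + vol_uuid)
    if not os.path.islink(link):
        return None
    # The link target is relative (../../nvme0n2); realpath resolves it
    # against the by-id directory.
    return os.path.realpath(link)
```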


> but I think this should be handled in cinderlib, since OpenStack has
> the same problem with migration.

OpenStack doesn't have that problem with migrations.

In OpenStack we don't care where the device appears, because Nova knows
the volume ID before calling os-brick to connect to it, so when
os-brick returns the path, Nova knows it belongs to that specific
volume.
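If oVirt goes with the /run/vdsm symlink scheme Nir proposed above, the
per-host bookkeeping could be as small as the following. An illustrative
sketch, not vdsm code; the directory name just follows Nir's example:

```python
import os

def publish_stable_link(vol_uuid, kernel_path,
                        run_dir="/run/vdsm/managedvolumes"):
    """Expose a volume under a cluster-stable name.

    /dev/nvmeXnY differs per host, but <run_dir>/<uuid> is the same on
    every host, so a migration destination can address the volume by
    UUID alone, whatever device name the kernel assigned locally.
    """
    os.makedirs(run_dir, exist_ok=True)
    link = os.path.join(run_dir, vol_uuid)
    tmp = link + ".tmp"
    if os.path.lexists(tmp):
        os.unlink(tmp)
    os.symlink(kernel_path, tmp)
    os.replace(tmp, link)  # atomic swap, safe if the link already exists
    return link
```

Re-publishing after a reconnect (when the kernel may pick a different
nvmeXnY name) atomically repoints the same stable alias.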

Cheers,
Gorka.

>
> Nir
>
> >
> > Cheers,
> > Muli
> > --
> > Muli Ben-Yehuda
> > Co-Founder and Chief Scientist @ http://www.lightbitslabs.com
> > LightOS: The Special Storage Sauce For Your Cloud
> >
> >
> > On Wed, Feb 23, 2022 at 4:55 PM Nir Soffer <nsof...@redhat.com> wrote:
> >>
> >> On Wed, Feb 23, 2022 at 4:20 PM Muli Ben-Yehuda <m...@lightbitslabs.com> 
> >> wrote:
> >> >
> >> > Thanks, Nir and Benny (nice to run into you again, Nir!). I'm a neophyte 
> >> > in ovirt and vdsm... What's the simplest way to set up a development 
> >> > environment? Is it possible to set up a "standalone" vdsm environment to 
> >> > hack support for nvme/tcp or do I need "full ovirt" to make it work?
> >>
> >> It should be possible to install vdsm on a single host or VM, use the
> >> vdsm API to bring the host to the right state, and then attach devices
> >> and run VMs. But I don't know anyone who has pulled this off, since
> >> simulating what engine does is hard.
> >>
> >> So the best way is to set up at least one host and engine host using the
> >> latest 4.5 rpms, and continue from there. Once you have a host, building
> >> vdsm on the host and upgrading the rpms is pretty easy.
> >>
> >> My preferred setup is to create vms using virt-manager for hosts, engine
> >> and storage and run all the vms on my laptop.
> >>
> >> Note that you must have some traditional storage (NFS/iSCSI) to bring up
> >> the system even if you plan to use only managed block storage (MBS).
> >> Unfortunately, when we added MBS support we did not have time to fix the
> >> huge technical debt, so you still need a master storage domain using one
> >> of the traditional legacy options.
> >>
> >> To build a setup, you can use:
> >>
> >> - engine vm: 6g ram, 2 cpus, centos stream 8
> >> - hosts vm: 4g ram, 2 cpus, centos stream 8
> >>   you can start with one host and add more hosts later if you want to
> >> test migration.
> >> - storage vm: 2g ram, 2 cpus, any os you like, I use alpine since it
> >> takes very little
> >>   memory and its NFS server is fast.
> >>
> >> See the vdsm README for instructions on how to set up a host:
> >> https://github.com/oVirt/vdsm#manual-installation
> >>
> >> For engine host you can follow:
> >> https://ovirt.org/documentation/installing_ovirt_as_a_self-hosted_engine_using_the_command_line/#Enabling_the_Red_Hat_Virtualization_Manager_Repositories_install_RHVM
> >>
> >> And after that this should work:
> >>
> >>     dnf install ovirt-engine
> >>     engine-setup
> >>
> >> Accepting all the defaults should work.
> >>
> >> When you have engine running, you can add a new host with
> >> the IP address or DNS name of your host VM(s), and engine will
> >> do everything for you. Note that you must install the ovirt-release-master
> >> rpm on the host before you add it to engine.
> >>
> >> Nir
> >>
> >> >
> >> > Cheers,
> >> > Muli
> >> > --
> >> > Muli Ben-Yehuda
> >> > Co-Founder and Chief Scientist @ http://www.lightbitslabs.com
> >> > LightOS: The Special Storage Sauce For Your Cloud
> >> >
> >> >
> >> > On Wed, Feb 23, 2022 at 4:16 PM Nir Soffer <nsof...@redhat.com> wrote:
> >> >>
> >> >> On Wed, Feb 23, 2022 at 2:48 PM Benny Zlotnik <bzlot...@redhat.com> 
> >> >> wrote:
> >> >> >
> >> >> > So I started looking in the logs and tried to follow along with the
> >> >> > code, but things didn't make sense and then I saw it's ovirt 4.3 which
> >> >> > makes things more complicated :)
> >> >> > Unfortunately because GUID is sent in the metadata the volume is
> >> >> > treated as a vdsm managed volume[2] for the udev rule generation and
> >> >> > it prepends the /dev/mapper prefix to an empty string as a result.
> >> >> > I don't have the vdsm logs, so I am not sure where exactly this fails,
> >> >> > but if it's after [4] it may be possible to workaround it with a vdsm
> >> >> > hook
> >> >> >
> >> >> > In 4.4.6 we moved the udev rule triggering to the volume mapping
> >> >> > phase, before starting the VM. But it could still not work, because
> >> >> > we check the driver_volume_type in [1], and I saw it's
> >> >> > "driver_volume_type": "lightos" for lightbits
> >> >> > In theory it looks like it wouldn't take much to add support for your
> >> >> > driver in a future release (as it's pretty late for 4.5)
> >> >>
> >> >> Adding support for nvme/tcp in 4.3 is probably not feasible, but we will
> >> >> be happy to accept patches for 4.5.
> >> >>
> >> >> To debug such issues vdsm log is the best place to check. We should see
> >> >> the connection info passed to vdsm, and we have pretty simple code using
> >> >> it with os_brick to attach the device to the system and setting up the 
> >> >> udev
> >> >> rule (which may need some tweaks).
> >> >>
> >> >> Nir
> >> >>
> >> >> > [1] 
> >> >> > https://github.com/oVirt/vdsm/blob/500c035903dd35180d71c97791e0ce4356fb77ad/lib/vdsm/storage/managedvolume.py#L110
> >> >> >
> >> >> > (4.3)
> >> >> > [2] 
> >> >> > https://github.com/oVirt/vdsm/blob/b42d4a816b538e00ea4955576a5fe762367be787/lib/vdsm/clientIF.py#L451
> >> >> > [3] 
> >> >> > https://github.com/oVirt/vdsm/blob/b42d4a816b538e00ea4955576a5fe762367be787/lib/vdsm/storage/hsm.py#L3141
> >> >> > [4] 
> >> >> > https://github.com/oVirt/vdsm/blob/b42d4a816b538e00ea4955576a5fe762367be787/lib/vdsm/virt/vm.py#L3835
> >> >> >
> >> >> >
> >> >> >
> >> >> >
> >> >> >
> >> >> >
> >> >> >
> >> >> >
> >> >> > On Wed, Feb 23, 2022 at 12:44 PM Muli Ben-Yehuda 
> >> >> > <m...@lightbitslabs.com> wrote:
> >> >> > >
> >> >> > > Certainly, thanks for your help!
> >> >> > > I put cinderlib and engine.log here: 
> >> >> > > http://www.mulix.org/misc/ovirt-logs-20220223123641.tar.gz
> >> >> > > If you grep for 'mulivm1' you will see for example:
> >> >> > >
> >> >> > > 2022-02-22 04:31:04,473-05 ERROR 
> >> >> > > [org.ovirt.engine.core.vdsbroker.vdsbroker.HotPlugDiskVDSCommand] 
> >> >> > > (default task-10) [36d8a122] Command 
> >> >> > > 'HotPlugDiskVDSCommand(HostName = client1, 
> >> >> > > HotPlugDiskVDSParameters:{hostId='fc5c2860-36b1-4213-843f-10ca7b35556c',
> >> >> > >  vmId='e13f73a0-8e20-4ec3-837f-aeacc082c7aa', 
> >> >> > > diskId='d1e1286b-38cc-4d56-9d4e-f331ffbe830f', addressMap='[bus=0, 
> >> >> > > controller=0, unit=2, type=drive, target=0]'})' execution failed: 
> >> >> > > VDSGenericException: VDSErrorException: Failed to HotPlugDiskVDS, 
> >> >> > > error = Failed to bind /dev/mapper/ on to 
> >> >> > > /var/run/libvirt/qemu/21-mulivm1.mapper.: Not a directory, code = 45
> >> >> > >
> >> >> > > Please let me know what other information will be useful and I
> >> >> > > will provide it.
> >> >> > >
> >> >> > > Cheers,
> >> >> > > Muli
> >> >> > >
> >> >> > > On Wed, Feb 23, 2022 at 11:14 AM Benny Zlotnik 
> >> >> > > <bzlot...@redhat.com> wrote:
> >> >> > >>
> >> >> > >> Hi,
> >> >> > >>
> >> >> > >> We haven't tested this, and we do not have any code to handle 
> >> >> > >> nvme/tcp
> >> >> > >> drivers, only iscsi and rbd. Given the path seen in the logs
> >> >> > >> '/dev/mapper', it looks like it might require code changes to 
> >> >> > >> support
> >> >> > >> this.
> >> >> > >> Can you share cinderlib[1] and engine logs to see what is returned 
> >> >> > >> by
> >> >> > >> the driver? I may be able to estimate what would be required (it's
> >> >> > >> possible that it would be enough to just change the handling of the
> >> >> > >> path in the engine)
> >> >> > >>
> >> >> > >> [1] /var/log/ovirt-engine/cinderlib/cinderlib//log
> >> >> > >>
> >> >> > >> On Wed, Feb 23, 2022 at 10:54 AM <m...@lightbitslabs.com> wrote:
> >> >> > >> >
> >> >> > >> > Hi everyone,
> >> >> > >> >
> >> >> > >> > We are trying to set up ovirt (4.3.10 at the moment, customer 
> >> >> > >> > preference) to use Lightbits (https://www.lightbitslabs.com) 
> >> >> > >> > storage via our openstack cinder driver with cinderlib. The 
> >> >> > >> > cinderlib and cinder driver bits are working fine but when ovirt 
> >> >> > >> > tries to attach the device to a VM we get the following error:
> >> >> > >> >
> >> >> > >> > libvirt:  error : cannot create file 
> >> >> > >> > '/var/run/libvirt/qemu/18-mulivm1.dev/mapper/': Is a directory
> >> >> > >> >
> >> >> > >> > We get the same error regardless of whether I try to run the
> >> >> > >> > VM or try to attach the device while it is running. The error
> >> >> > >> > appears to come from vdsm, which passes /dev/mapper as the
> >> >> > >> > preferred device?
> >> >> > >> >
> >> >> > >> > 2022-02-22 09:50:11,848-0500 INFO  (vm/3ae7dcf4) [vdsm.api] 
> >> >> > >> > FINISH appropriateDevice return={'path': '/dev/mapper/', 
> >> >> > >> > 'truesize': '53687091200', 'apparentsize': '53687091200'} 
> >> >> > >> > from=internal, task_id=77f40c4e-733d-4d82-b418-aaeb6b912d39 
> >> >> > >> > (api:54)
> >> >> > >> > 2022-02-22 09:50:11,849-0500 INFO  (vm/3ae7dcf4) [vds] prepared 
> >> >> > >> > volume path: /dev/mapper/ (clientIF:510)
> >> >> > >> >
> >> >> > >> > Suggestions for how to debug this further? Is this a known 
> >> >> > >> > issue? Did anyone get nvme/tcp storage working with ovirt and/or 
> >> >> > >> > vdsm?
> >> >> > >> >
> >> >> > >> > Thanks,
> >> >> > >> > Muli
> >> >> > >> >
> >> >> > >> > _______________________________________________
> >> >> > >> > Users mailing list -- us...@ovirt.org
> >> >> > >> > To unsubscribe send an email to users-le...@ovirt.org
> >> >> > >> > Privacy Statement: https://www.ovirt.org/privacy-policy.html
> >> >> > >> > oVirt Code of Conduct: 
> >> >> > >> > https://www.ovirt.org/community/about/community-guidelines/
> >> >> > >> > List Archives: 
> >> >> > >> > https://lists.ovirt.org/archives/list/us...@ovirt.org/message/I3PAG5HMBHUOJYPAI5ES3JHG6HCC3S6N/
> >> >> > >>
> >> >> > >
> >> >> > > Lightbits Labs
> >> >> > > Lead the cloud-native data center transformation by delivering 
> >> >> > > scalable and efficient software defined storage that is easy to 
> >> >> > > consume.
> >> >> > >
> >> >> > > This message is sent in confidence for the addressee only.  It may 
> >> >> > > contain legally privileged information. The contents are not to be 
> >> >> > > disclosed to anyone other than the addressee. Unauthorized 
> >> >> > > recipients are requested to preserve this confidentiality, advise 
> >> >> > > the sender immediately of any error in transmission and delete the 
> >> >> > > email from their systems.
> >> >> > >
> >> >> > >
> >> >>
> >> >
> >>
> >
>
_______________________________________________
Devel mailing list -- devel@ovirt.org
To unsubscribe send an email to devel-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/devel@ovirt.org/message/O2QIACDF3Q7IYISMYRRSRFL2VG7GMYSQ/
