genegr opened a new pull request, #13061:
URL: https://github.com/apache/cloudstack/pull/13061

   ### Description
   
   Adds an end-to-end NVMe-over-TCP data path for CloudStack on KVM, using the 
FlashArray adaptive plugin as the first (and currently only) consumer. The 
change is opt-in — existing Fibre Channel FlashArray / Primera deployments 
continue to work unchanged.
   
   A FlashArray pool is switched to NVMe-TCP by adding a single 
`transport=nvme-tcp` query parameter to the pool URL on `createStoragePool`:
   
   ```
   url=https://<user>:<pass>@<fa-ip>:443/api?pod=<pod>&transport=nvme-tcp&hostgroup=<hg>
   ```
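
   For illustration only, the opt-in check can be sketched in a few lines of Python — the real routing lives in the Java adaptive lifecycle, and the function name here is made up:

   ```python
   from urllib.parse import urlparse, parse_qs

   def pool_transport(pool_url: str) -> str:
       """Return the transport requested in a createStoragePool URL.

       Defaults to "fc" when no transport= parameter is present,
       mirroring the PR's opt-in behaviour. Illustrative sketch, not
       the actual Java lifecycle code.
       """
       query = parse_qs(urlparse(pool_url).query)
       return query.get("transport", ["fc"])[0]

   url = ("https://user:[email protected]:443/api"
          "?pod=cloudstack&transport=nvme-tcp&hostgroup=cluster1")
   print(pool_transport(url))  # nvme-tcp
   ```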
   
   When that parameter is present, the adaptive lifecycle stamps the pool with 
the new `StoragePoolType.NVMeTCP` and the KVM agent dispatches to a brand-new 
`MultipathNVMeOFAdapterBase` / `NVMeTCPAdapter` pair. The FlashArray adapter 
then attaches volumes as host-group-scoped NVMe connections, builds EUI-128 
NGUIDs in the `/dev/disk/by-id/nvme-eui.<32-hex>` layout that udev emits for a 
Pure namespace, and reverses that layout when CloudStack looks up a volume by 
address.
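
   A minimal Python sketch of that layout round-trip, assuming only what is stated above (32 hex digits after the `nvme-eui.` prefix); the function names are illustrative, not the adapter's actual API:

   ```python
   def eui_to_by_id_path(nguid_hex: str) -> str:
       """Build the udev by-id path for a namespace with the given NGUID.

       The 32-hex-digit NGUID is rendered lowercase, matching the
       /dev/disk/by-id/nvme-eui.<32-hex> layout described above.
       """
       nguid = nguid_hex.lower()
       if len(nguid) != 32 or any(c not in "0123456789abcdef" for c in nguid):
           raise ValueError(f"not a 32-hex-digit NGUID: {nguid_hex!r}")
       return f"/dev/disk/by-id/nvme-eui.{nguid}"

   def by_id_path_to_eui(path: str) -> str:
       """Reverse the layout: recover the NGUID from a by-id path."""
       prefix = "/dev/disk/by-id/nvme-eui."
       if not path.startswith(prefix):
           raise ValueError(f"unexpected by-id layout: {path!r}")
       return path[len(prefix):]

   nguid = "00e1a2b3c4d5e6f7a1b2c3d4e5f60001"  # illustrative value
   assert by_id_path_to_eui(eui_to_by_id_path(nguid)) == nguid
   ```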
   
   The six commits are split along natural seams (address type, FA REST-side 
support, storage pool type, KVM adapter, adaptive lifecycle routing, docs) so 
each can be reviewed independently.
   
   Why a separate `NVMeTCP` pool type (and a separate 
`MultipathNVMeOFAdapterBase`) rather than reusing `FiberChannel` / 
`MultipathSCSIAdapterBase`?
   
   - NVMe-oF uses a different command set (NVMe, not SCSI), identifies 
namespaces by EUI-128 NGUIDs (not SCSI WWNs), and on Linux is multipathed 
natively by the `nvme` driver rather than by device-mapper multipath. Keeping 
it out of the SCSI code path avoids special-casing inside every method that 
handles path resolution, connect, disconnect, or size lookup.
   - The new base class is fabric-agnostic: a future NVMe-RoCE or NVMe-FC 
adapter would only need a concrete subclass and a new pool-type value, without 
touching the SCSI code.
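
   To make the fabric-agnostic split concrete, here is a hypothetical Python analogue of the class hierarchy; the method names are invented for this sketch and do not mirror the actual Java signatures:

   ```python
   from abc import ABC, abstractmethod

   class MultipathNVMeOFAdapter(ABC):
       """Fabric-agnostic skeleton: logic shared by every NVMe fabric."""

       def device_path(self, nguid: str) -> str:
           # NGUID-based addressing is common to all NVMe-oF transports.
           return f"/dev/disk/by-id/nvme-eui.{nguid}"

       @abstractmethod
       def transport(self) -> str:
           """Fabric-specific transport keyword (hypothetical hook)."""

   class NVMeTCPAdapter(MultipathNVMeOFAdapter):
       def transport(self) -> str:
           return "tcp"

   # A future RoCE (or FC) adapter would only add a concrete subclass
   # plus a new pool-type value, leaving the SCSI code untouched:
   class NVMeRoCEAdapter(MultipathNVMeOFAdapter):
       def transport(self) -> str:
           return "rdma"
   ```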
   
   ### Types of changes
   
   - [x] Enhancement (non-breaking change which adds functionality)
   - [ ] Bugfix
   - [ ] Breaking change
   
   ### Feature/Enhancement Scale or Bug Severity
   
   Feature. Opt-in via `transport=nvme-tcp` URL parameter on pool registration. 
Defaults are unchanged.
   
   ### How Has This Been Tested?
   
   Validated end-to-end on a 4.23-SNAPSHOT lab against a Pure Storage 
FlashArray running Purity 6.7.7:
   
   - Prerequisites on each KVM host: an OVS bridge `cloudbr-nvme` with an IP 
on the NVMe subnet; `nvme-cli` and the `nvme_tcp` kernel module; a persistent 
`/etc/nvme/hostnqn`; a populated `/etc/nvme/discovery.conf`; and `nvme 
connect-all` enabled at boot.
   - Prerequisites on the array: a pod (`cloudstack`); a hostgroup matching 
the CloudStack cluster name (`cluster1`); and, inside the hostgroup, one host 
per KVM host bound to that host's NQN.
   - Registered a FlashArray primary pool with `provider="Flash Array"`, 
`transport=nvme-tcp`, `hostgroup=cluster1` → pool enters `Up` state, `type: 
NVMeTCP`.
   - Created a 20 GiB volume from a `tags=nvme` disk offering and attached it 
to a Rocky 9 VM: the volume's path carried `type=NVMETCP; address=<EUI-128>; 
connid.kvm01=1; connid.kvm02=1;`; both hosts saw 
`/dev/disk/by-id/nvme-eui.<that EUI>` via the host-group NVMe connection, and 
libvirt presented the namespace to the guest as `/dev/vdb`.
   - Inside the guest: `mkfs.ext4 /dev/vdb`, wrote 16 MiB of `/dev/urandom` 
with `conv=fsync`, recorded SHA-256, unmounted/remounted, re-checksummed → hash 
matched.
   - Live-migrated the VM between the two KVM hosts while a `sha256sum` probe 
loop ran against `/mnt/nvme/pattern.bin` every 2 s. Migration completed in 
6 s, and the loop reported the same hash across the migration window with no 
gap, demonstrating that the host-group-scoped connection gives both hosts a 
live path.
   - Default-path Fibre Channel registrations (no `transport=` parameter) 
continue to work — `type: FiberChannel`, FC WWN addressing, same 
`MultipathSCSIAdapterBase` code path as before.
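
   The volume-path format recorded above can be parsed mechanically. This small Python parser was written for this note (it is not the CloudStack implementation) and assumes the `key=value;`-pair layout shown:

   ```python
   def parse_volume_path(path: str) -> dict:
       """Parse a semicolon-delimited volume path such as
       "type=NVMETCP; address=<EUI>; connid.kvm01=1;" into a dict."""
       fields = {}
       for part in path.split(";"):
           part = part.strip()
           if not part:
               continue  # skip the empty fragment after the trailing ";"
           key, _, value = part.partition("=")
           fields[key] = value
       return fields

   p = parse_volume_path(
       "type=NVMETCP; address=00e1a2b3c4d5e6f7a1b2c3d4e5f60001; "
       "connid.kvm01=1; connid.kvm02=1;")
   # p["type"] == "NVMETCP"; one connid.<host> entry per host in the hostgroup
   ```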
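
   The guest data-integrity check generalizes to this Python sketch, which substitutes a temporary file for the NVMe-backed `/mnt/nvme` mount (an assumption, for local reproduction) and uses `hashlib` in place of `dd` + `sha256sum`:

   ```python
   import hashlib
   import os
   import tempfile

   def sha256_of(path: str) -> str:
       """Stream a file through SHA-256, 1 MiB at a time."""
       h = hashlib.sha256()
       with open(path, "rb") as f:
           for chunk in iter(lambda: f.read(1 << 20), b""):
               h.update(chunk)
       return h.hexdigest()

   # Stand-in for /mnt/nvme/pattern.bin on the NVMe-TCP volume.
   with tempfile.NamedTemporaryFile(delete=False) as f:
       f.write(os.urandom(16 * 1024 * 1024))  # 16 MiB of random data
       f.flush()
       os.fsync(f.fileno())                   # same intent as conv=fsync
       target = f.name

   before = sha256_of(target)
   after = sha256_of(target)  # in the lab: re-read after unmount/remount
   assert before == after
   os.unlink(target)
   ```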
   
   ### Notes
   
   - This PR stacks logically on top of #13050 (FlashArray empty-pod capacity 
fallback) for a clean empty-pod registration experience, but does **not** 
require it: without #13050 merged, passing `capacitybytes=` on 
`createStoragePool` is a workable workaround. Another PR is pending to make 
`AdaptiveDataStoreLifeCycleImpl.initialize()` honor user-supplied capacity 
even when provider stats are unavailable.
   - Future work (not in this PR): implement `copyPhysicalDisk` on 
`MultipathNVMeOFAdapterBase` so NVMe-TCP pools can also host system/VR root 
disks. Today the adapter deliberately throws `UnsupportedOperationException` 
for `copyPhysicalDisk` and callers are expected to keep ROOT volumes on a 
conventional NFS/file pool and place only DATA volumes on the NVMe-TCP pool.
   

