Public bug reported:

Recently we have had customers report issues [1][2] when using Nova with Cinder volumes (apparently sparse volumes) where the guest disk <driver> XML element sets the attribute io=native. These customers experienced reduced disk I/O performance or guest hangs with their NFS or Fibre Channel Cinder volumes, and narrowed the cause down to the io=native attribute that Nova sets in the guest XML.
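For anyone checking whether a guest is affected, here is a minimal sketch (not part of Nova) of how one might list the disks whose <driver> element sets io=native. The XML below is a hypothetical, trimmed-down example and the function name is made up for illustration; real guest XML, e.g. from `virsh dumpxml`, contains many more elements.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down guest XML for illustration only; the paths and
# device names are placeholders, not taken from a real deployment.
EXAMPLE_DOMAIN_XML = """
<domain type='kvm'>
  <devices>
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2' cache='none'/>
      <source file='/var/lib/nova/instances/example/disk'/>
      <target dev='vda' bus='virtio'/>
    </disk>
    <disk type='block' device='disk'>
      <driver name='qemu' type='raw' cache='none' io='native'/>
      <source dev='/dev/example-volume'/>
      <target dev='vdb' bus='virtio'/>
    </disk>
  </devices>
</domain>
"""


def disks_with_native_aio(domain_xml):
    """Return target device names of disks whose <driver> sets io='native'."""
    root = ET.fromstring(domain_xml)
    native = []
    for disk in root.findall("./devices/disk"):
        driver = disk.find("driver")
        target = disk.find("target")
        # A missing io attribute means the AIO mode is left to QEMU.
        if driver is not None and driver.get("io") == "native":
            native.append(target.get("dev") if target is not None else None)
    return native


if __name__ == "__main__":
    print(disks_with_native_aio(EXAMPLE_DOMAIN_XML))
```

With the example XML above this reports only vdb, the disk whose <driver> carries io='native'; vda has no io attribute, so its AIO mode is left to QEMU.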
The hard-coding of io=native in the guest XML by Nova for the iSCSI, Fibre Channel, and NFS Cinder volume backends was added about 10 years ago to improve disk performance [3]. It seems that determination may no longer be accurate, or at least it is not universally the case.

QEMU has its own logic for selecting the best AIO mode for the device at hand and, judging from the aforementioned experiences, some deployers need to be able to let QEMU choose the AIO mode rather than have Nova hard-code it.

It's possible that the entire assumption needs to be revisited at a fundamental level given the amount of time that has passed since the hard-coding was added. QEMU may have had advancements since then and may even have access to more modern AIO modes such as io_uring.

For the immediate term, we can add a [workarounds] config option to enable deployers to defer AIO mode selection to QEMU if they are having problems with io=native.

For the long term, we will need to discuss the topic with the Cinder team to learn if there is something we need to change more broadly in Nova.

[1] https://issues.redhat.com/browse/OSPRH-20325
[2] https://issues.redhat.com/browse/OSPRH-20737
[3] https://specs.openstack.org/openstack/nova-specs/specs/mitaka/implemented/libvirt-aio-mode.html

** Affects: nova
     Importance: Undecided
     Assignee: melanie witt (melwitt)
         Status: In Progress

** Tags: nfs volumes

https://bugs.launchpad.net/bugs/2129788

Title:
  Native AIO mode for Cinder volumes is not always appropriate

Status in OpenStack Compute (nova):
  In Progress

