Public bug reported:

Description
===========
[This was initially reported by a Red Hat OSP customer.]

The I/O latency of a Cinder volume increases significantly after a
live migration of the instance to which it is attached, and stays
elevated until the VM is stopped and started again. (The VM is booted
from the Cinder volume. This is not the case when the disk comes from
the Nova image store backend, i.e. without a Cinder volume -- or at
least the latency difference after a live migration is not nearly as
large.) The storage backend is Ceph 2.0.

How reproducible: Consistently

Steps to Reproduce
==================
(0) Both the Nova instances and Cinder volumes are located on Ceph
(1) Create a Nova instance with a Cinder volume attached to it
(2) Live migrate it to a target Compute node
(3) Run `ioping` (`ioping -c 10 .`) on the Cinder volume.
    Alternatively, run another I/O benchmark such as `fio` with
    'direct=1' (non-buffered I/O) as a sanity check, to get a second
    opinion on the latency; an illustrative invocation is sketched
    after the "Expected result" section below.

Actual result
=============
Before live migration, `ioping` output on the Cinder volume attached
to the Nova instance:

[guest]$ sudo ioping -c 10 .
4 KiB <<< . (xfs /dev/sda1): request=1 time=98.0 us (warmup)
4 KiB <<< . (xfs /dev/sda1): request=2 time=135.6 us
4 KiB <<< . (xfs /dev/sda1): request=3 time=155.5 us
4 KiB <<< . (xfs /dev/sda1): request=4 time=161.7 us
4 KiB <<< . (xfs /dev/sda1): request=5 time=148.4 us
4 KiB <<< . (xfs /dev/sda1): request=6 time=354.3 us
4 KiB <<< . (xfs /dev/sda1): request=7 time=138.0 us (fast)
4 KiB <<< . (xfs /dev/sda1): request=8 time=150.7 us
4 KiB <<< . (xfs /dev/sda1): request=9 time=149.6 us
4 KiB <<< . (xfs /dev/sda1): request=10 time=138.6 us (fast)

--- . (xfs /dev/sda1) ioping statistics ---
9 requests completed in 1.53 ms, 36 KiB read, 5.87 k iops, 22.9 MiB/s
generated 10 requests in 9.00 s, 40 KiB, 1 iops, 4.44 KiB/s
min/avg/max/mdev = 135.6 us / 170.3 us / 354.3 us / 65.6 us

After live migration, `ioping` output on the same Cinder volume:

[guest]$ sudo ioping -c 10 .
4 KiB <<< . (xfs /dev/sda1): request=1 time=1.03 ms (warmup)
4 KiB <<< . (xfs /dev/sda1): request=2 time=948.6 us
4 KiB <<< . (xfs /dev/sda1): request=3 time=955.7 us
4 KiB <<< . (xfs /dev/sda1): request=4 time=920.5 us
4 KiB <<< . (xfs /dev/sda1): request=5 time=1.03 ms
4 KiB <<< . (xfs /dev/sda1): request=6 time=838.2 us
4 KiB <<< . (xfs /dev/sda1): request=7 time=1.13 ms (slow)
4 KiB <<< . (xfs /dev/sda1): request=8 time=868.6 us
4 KiB <<< . (xfs /dev/sda1): request=9 time=985.2 us
4 KiB <<< . (xfs /dev/sda1): request=10 time=936.6 us

--- . (xfs /dev/sda1) ioping statistics ---
9 requests completed in 8.61 ms, 36 KiB read, 1.04 k iops, 4.08 MiB/s
generated 10 requests in 9.00 s, 40 KiB, 1 iops, 4.44 KiB/s
min/avg/max/mdev = 838.2 us / 956.9 us / 1.13 ms / 81.0 us

The average goes back to roughly 200 us after shutting down and
starting up the instance.

Expected result
===============
No significant increase in I/O latency on Cinder volumes after a live
migration.
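fio sanity check (illustrative)
===============================
A minimal `fio` run for the direct-I/O sanity check mentioned in step
(3) could look like the sketch below. The job name, test file path,
size, and runtime are illustrative assumptions, not values from the
original report:

[guest]$ sudo fio --name=lat-check --filename=/mnt/volume/fio.test \
    --size=256m --direct=1 --rw=randread --bs=4k --ioengine=libaio \
    --iodepth=1 --runtime=30 --time_based --group_reporting

With direct=1 and iodepth=1, fio issues one non-buffered 4 KiB read
at a time, so its reported completion latency ('clat') should roughly
track the per-request `ioping` times above.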
** Affects: nova
     Importance: Undecided
     Assignee: Kashyap Chamarthy (kashyapc)
         Status: In Progress

** Changed in: nova
     Assignee: (unassigned) => Kashyap Chamarthy (kashyapc)

** Changed in: nova
       Status: New => Confirmed

https://bugs.launchpad.net/bugs/1706083

Title:
  Post-migration, Cinder volumes lose disk cache value resulting in
  I/O latency

Status in OpenStack Compute (nova):
  In Progress
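Diagnostic note (illustrative)
==============================
The title points at the volume's disk cache mode being lost across
the migration. Assuming access to the compute nodes, one way to check
this is to compare the cache attribute of the volume's <driver>
element in the libvirt domain XML on the source host before the
migration and on the destination host after it (the domain name below
is a hypothetical example):

[compute]$ sudo virsh dumpxml instance-00000042 | grep 'cache='

A different cache= value after the migration (e.g. 'writeback' before
versus 'none' after) would be consistent with the observed latency
regression.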