Since commit 7ff9ff039380 ("meson: mitigate against use of uninitialize
stack for exploits") the -ftrivial-auto-var-init=zero compiler option is
used to zero local variables. While this reduces security risks
associated with uninitialized stack data, it introduced a measurable
bottleneck in the virtqueue_split_pop() and virtqueue_packed_pop()
functions.
These virtqueue functions are in the hot path. They are called for each
element (request) that is popped from a VIRTIO device's virtqueue. Using
__attribute__((uninitialized)) on large stack variables in these
functions improves fio randread bs=4k iodepth=64 performance from 304k
to 332k IOPS (+9%).
This issue was found using perf-top(1). virtqueue_split_pop() was one of
the top CPU consumers and the "annotate" feature showed that the memory
zeroing instructions at the beginning of the functions were hot.
Fixes: 7ff9ff039380 ("meson: mitigate against use of uninitialize stack for
exploits")
Cc: Daniel P. Berrangé <[email protected]>
Signed-off-by: Stefan Hajnoczi <[email protected]>
---
include/qemu/compiler.h | 12 ++++++++++++
hw/virtio/virtio.c | 8 ++++----
2 files changed, 16 insertions(+), 4 deletions(-)
diff --git a/include/qemu/compiler.h b/include/qemu/compiler.h
index 496dac5ac1..fabd540b02 100644
--- a/include/qemu/compiler.h
+++ b/include/qemu/compiler.h
@@ -207,6 +207,18 @@
# define QEMU_USED
#endif
+/*
+ * Disable -ftrivial-auto-var-init on a local variable. Use this in rare cases
+ * when the compiler zeroes a large on-stack variable and this causes a
+ * performance bottleneck. Only use it when performance data indicates this is
+ * necessary since security risks increase with uninitialized stack variables.
+ */
+#if __has_attribute(uninitialized)
+# define QEMU_UNINITIALIZED __attribute__((uninitialized))
+#else
+# define QEMU_UNINITIALIZED
+#endif
+
/*
* http://clang.llvm.org/docs/ThreadSafetyAnalysis.html
*
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 5534251e01..82a285a31d 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -1689,8 +1689,8 @@ static void *virtqueue_split_pop(VirtQueue *vq, size_t sz)
VirtIODevice *vdev = vq->vdev;
VirtQueueElement *elem = NULL;
unsigned out_num, in_num, elem_entries;
- hwaddr addr[VIRTQUEUE_MAX_SIZE];
- struct iovec iov[VIRTQUEUE_MAX_SIZE];
+ hwaddr QEMU_UNINITIALIZED addr[VIRTQUEUE_MAX_SIZE];
+ struct iovec QEMU_UNINITIALIZED iov[VIRTQUEUE_MAX_SIZE];
VRingDesc desc;
int rc;
@@ -1836,8 +1836,8 @@ static void *virtqueue_packed_pop(VirtQueue *vq, size_t
sz)
VirtIODevice *vdev = vq->vdev;
VirtQueueElement *elem = NULL;
unsigned out_num, in_num, elem_entries;
- hwaddr addr[VIRTQUEUE_MAX_SIZE];
- struct iovec iov[VIRTQUEUE_MAX_SIZE];
+ hwaddr QEMU_UNINITIALIZED addr[VIRTQUEUE_MAX_SIZE];
+ struct iovec QEMU_UNINITIALIZED iov[VIRTQUEUE_MAX_SIZE];
VRingPackedDesc desc;
uint16_t id;
int rc;
--
2.49.0