Re: [PATCH v8 00/13] vhost-user: support any POSIX system (tested on macOS, FreeBSD, OpenBSD)

2024-07-05 Thread Stefano Garzarella

On Wed, Jul 03, 2024 at 06:49:30PM GMT, Michael S. Tsirkin wrote:
> On Tue, Jun 18, 2024 at 12:00:30PM +0200, Stefano Garzarella wrote:
>> As discussed with Michael and Markus [1], this version also includes the patch
>> on which v7 depended to simplify the merge in Michael's tree.
>>
>> The series is all reviewed, so if there are no new changes required, I would
>> ask to merge it.
>
> I dropped patches 9 and 10 for now since otherwise make vm-build-freebsd
> fails.
>
> Pls figure it out and resend just 9 and 10.


I reproduced it locally, but I can't understand why it happens only on
certain architectures; in my case on loongarch64, ppc64, and riscv32:


326/846 qemu:qtest+qtest-loongarch64 / qtest-loongarch64/qos-test
ERROR  116.10s   killed by signal 6 SIGABRT
337/846 qemu:qtest+qtest-ppc64 / qtest-ppc64/qos-test
ERROR  115.10s   killed by signal 6 SIGABRT
339/846 qemu:qtest+qtest-riscv32 / qtest-riscv32/qos-test
ERROR  107.65s   killed by signal 6 SIGABRT

I focused on ppc64, running `gmake --output-sync -j6 check-qtest-ppc64`
in the FreeBSD VM, and it fails every time. The failing test is
`vhost-user/reconnect`: with it disabled as follows, the qos-test tests
always pass:


diff --git a/tests/qtest/vhost-user-test.c b/tests/qtest/vhost-user-test.c
index 0fa8951c9f..c3d686f0ee 100644
--- a/tests/qtest/vhost-user-test.c
+++ b/tests/qtest/vhost-user-test.c
@@ -1118,9 +1119,11 @@ static void register_vhost_user_test(void)
  "virtio-net",
  test_migrate, &opts);

+#if 0
 opts.before = vhost_user_test_setup_reconnect;
 qos_add_test("vhost-user/reconnect", "virtio-net",
  test_reconnect, &opts);
+#endif

 opts.before = vhost_user_test_setup_connect_fail;
 qos_add_test("vhost-user/connect-fail", "virtio-net",

Analyzing the test, what happens is that after the disconnection, the
test never receives the VHOST_USER_SET_MEM_TABLE message, so the second
`wait_for_fds()` fails after the 5-second timeout (increasing it doesn't
help) because no fds have arrived.


diff --git a/tests/qtest/vhost-user-test.c b/tests/qtest/vhost-user-test.c
index 0fa8951c9f..c3d686f0ee 100644
--- a/tests/qtest/vhost-user-test.c
+++ b/tests/qtest/vhost-user-test.c
@@ -976,6 +976,7 @@ static void test_reconnect(void *obj, void *arg, 
QGuestAllocator *alloc)

 g_source_set_callback(src, reconnect_cb, s, NULL);
 g_source_attach(src, s->context);
 g_source_unref(src);
+// THIS one is failing
 g_assert(wait_for_fds(s));
 wait_for_rings_started(s, 2);
 }

This is the test log (note: IIUC the QEMU failures happen after the test
exits on the assertion, so it could mean that the chardev reconnected
correctly):


▶ 28/30 
/ppc64/pseries/spapr-pci-host-bridge/pci-bus-spapr/pci-bus/virtio-net-pci/virtio-net/virtio-net-tests/vhost-user/reconnect
 - ERROR:../src/tests/qtest/qos-test.c:191:subprocess_run_one_test: child 
process 
(/ppc64/pseries/spapr-pci-host-bridge/pci-bus-spapr/pci-bus/virtio-net-pci/virtio-net/virtio-net-tests/vhost-user/reconnect/subprocess
 [54991]) failed unexpectedly FAIL
▶ 28/30   
ERROR
[28-30/30]  qemu:qtest+qtest-ppc64 / qtest-ppc64/qmp-cmd-test  
 [28-30/30]  qemu:qtest+qtest-ppc64 / qtest-ppc64/migration-test   
  28/30 qemu:qtest+qtest-ppc64 / qtest-ppc64/qos-test   
ERROR   21.53s   killed by signal 6 SIGABRT
>>> PYTHON=/usr/home/qemu/qemu-test.OD8v2L/build/pyvenv/bin/python3.9 
ASAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1 
G_TEST_DBUS_DAEMON=/usr/home/qemu/qemu-test.OD8v2L/src/tests/dbus-vmstate-daemon.sh 
QTEST_QEMU_BINARY=./qemu-system-ppc64 MALLOC_PERTURB_=141 QTEST_QEMU_IMG=./qemu-img 
QTEST_QEMU_STORAGE_DAEMON_BINARY=./storage-daemon/qemu-storage-daemon 
UBSAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1:print_stacktrace=1 
/usr/home/qemu/qemu-test.OD8v2L/build/tests/qtest/qos-test --tap -k
 ✀  

stderr:
Vhost user backend fails to broadcast fake RARP
qemu-system-ppc64: -chardev 
socket,id=chr-reconnect,path=/tmp/vhost-test-Z5VMQ2/reconnect.sock,server=on: 
info: QEMU waiting for connection on: 
disconnected:unix:/tmp/vhost-test-Z5VMQ2/reconnect.sock,server=on
**
ERROR:../src/tests/qtest/vhost-user-test.c:255:wait_for_fds: assertion failed: 
(s->fds_num)
qemu-system-ppc64: Failed to set msg fds.
qemu-system-ppc64: vhost VQ 0 ring restore failed: -22: Invalid argument 
(22)
qemu-system-ppc64: Failed to set msg fds.
qemu-system-ppc64: vhost_set_vring_endian failed: Invalid argument (22)
qemu-system-ppc64: Failed to set msg fds.
qemu-system-ppc64: vhost VQ 1 ring restore failed: -22: Invalid argument 
(22)

qemu-system-ppc64: Failed to 

[PATCH v8 12/13] tests/qtest/vhost-user-blk-test: use memory-backend-shm

2024-06-18 Thread Stefano Garzarella
`memory-backend-memfd` is available only on Linux while the new
`memory-backend-shm` can be used on any POSIX-compliant operating
system. Let's use it so we can run the test in multiple environments.

Since we are here, let's remove `share=on`, which is the default for shm
(and also for memfd).

Acked-by: Thomas Huth 
Acked-by: Stefan Hajnoczi 
Reviewed-by: Philippe Mathieu-Daudé 
Tested-by: Philippe Mathieu-Daudé 
Reviewed-by: David Hildenbrand 
Signed-off-by: Stefano Garzarella 
---
v6
- removed `share=on` since it's the default [David]
---
 tests/qtest/vhost-user-blk-test.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/qtest/vhost-user-blk-test.c 
b/tests/qtest/vhost-user-blk-test.c
index 117b9acd10..ea90d41232 100644
--- a/tests/qtest/vhost-user-blk-test.c
+++ b/tests/qtest/vhost-user-blk-test.c
@@ -906,7 +906,7 @@ static void start_vhost_user_blk(GString *cmd_line, int 
vus_instances,
vhost_user_blk_bin);
 
 g_string_append_printf(cmd_line,
-" -object memory-backend-memfd,id=mem,size=256M,share=on "
+" -object memory-backend-shm,id=mem,size=256M "
 " -M memory-backend=mem -m 256M ");
 
 for (i = 0; i < vus_instances; i++) {
-- 
2.45.2




[PATCH v8 08/13] vhost-user: enable frontends on any POSIX system

2024-06-18 Thread Stefano Garzarella
The vhost-user protocol is not really Linux-specific so let's enable
vhost-user frontends for any POSIX system.

In vhost_net.c we use VHOST_FILE_UNBIND which is defined in a Linux
specific header, let's define it for other systems as well.

Reviewed-by: Philippe Mathieu-Daudé 
Tested-by: Philippe Mathieu-Daudé 
Acked-by: Stefan Hajnoczi 
Reviewed-by: David Hildenbrand 
Signed-off-by: Stefano Garzarella 
---
 meson.build| 1 -
 hw/net/vhost_net.c | 5 +
 hw/block/Kconfig   | 2 +-
 3 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/meson.build b/meson.build
index 97e00d6f59..2ba95a8c35 100644
--- a/meson.build
+++ b/meson.build
@@ -151,7 +151,6 @@ have_tpm = get_option('tpm') \
 
 # vhost
 have_vhost_user = get_option('vhost_user') \
-  .disable_auto_if(host_os != 'linux') \
   .require(host_os != 'windows',
error_message: 'vhost-user is not available on Windows').allowed()
 have_vhost_vdpa = get_option('vhost_vdpa') \
diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
index fd1a93701a..fced429813 100644
--- a/hw/net/vhost_net.c
+++ b/hw/net/vhost_net.c
@@ -34,8 +34,13 @@
 #include "standard-headers/linux/virtio_ring.h"
 #include "hw/virtio/vhost.h"
 #include "hw/virtio/virtio-bus.h"
+#if defined(__linux__)
 #include "linux-headers/linux/vhost.h"
+#endif
 
+#ifndef VHOST_FILE_UNBIND
+#define VHOST_FILE_UNBIND -1
+#endif
 
 /* Features supported by host kernel. */
 static const int kernel_feature_bits[] = {
diff --git a/hw/block/Kconfig b/hw/block/Kconfig
index 9e8f28f982..29ee09e434 100644
--- a/hw/block/Kconfig
+++ b/hw/block/Kconfig
@@ -40,7 +40,7 @@ config VHOST_USER_BLK
 bool
 # Only PCI devices are provided for now
 default y if VIRTIO_PCI
-depends on VIRTIO && VHOST_USER && LINUX
+depends on VIRTIO && VHOST_USER
 
 config SWIM
 bool
-- 
2.45.2




[PATCH v8 13/13] tests/qtest/vhost-user-test: add a test case for memory-backend-shm

2024-06-18 Thread Stefano Garzarella
`memory-backend-shm` can be used with vhost-user devices, so let's
add a new test case for it.

Acked-by: Thomas Huth 
Acked-by: Stefan Hajnoczi 
Reviewed-by: David Hildenbrand 
Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Stefano Garzarella 
---
 tests/qtest/vhost-user-test.c | 23 +++
 1 file changed, 23 insertions(+)

diff --git a/tests/qtest/vhost-user-test.c b/tests/qtest/vhost-user-test.c
index d4e437265f..8c1d903b2a 100644
--- a/tests/qtest/vhost-user-test.c
+++ b/tests/qtest/vhost-user-test.c
@@ -44,6 +44,8 @@
 "mem-path=%s,share=on -numa node,memdev=mem"
 #define QEMU_CMD_MEMFD  " -m %d -object memory-backend-memfd,id=mem,size=%dM," 
\
 " -numa node,memdev=mem"
+#define QEMU_CMD_SHM" -m %d -object memory-backend-shm,id=mem,size=%dM," \
+" -numa node,memdev=mem"
 #define QEMU_CMD_CHR" -chardev socket,id=%s,path=%s%s"
 #define QEMU_CMD_NETDEV " -netdev vhost-user,id=hs0,chardev=%s,vhostforce=on"
 
@@ -195,6 +197,7 @@ enum test_memfd {
 TEST_MEMFD_AUTO,
 TEST_MEMFD_YES,
 TEST_MEMFD_NO,
+TEST_MEMFD_SHM,
 };
 
 static void append_vhost_net_opts(TestServer *s, GString *cmd_line,
@@ -228,6 +231,8 @@ static void append_mem_opts(TestServer *server, GString 
*cmd_line,
 
 if (memfd == TEST_MEMFD_YES) {
 g_string_append_printf(cmd_line, QEMU_CMD_MEMFD, size, size);
+} else if (memfd == TEST_MEMFD_SHM) {
+g_string_append_printf(cmd_line, QEMU_CMD_SHM, size, size);
 } else {
 const char *root = init_hugepagefs() ? : server->tmpfs;
 
@@ -788,6 +793,19 @@ static void *vhost_user_test_setup_memfd(GString 
*cmd_line, void *arg)
 return server;
 }
 
+static void *vhost_user_test_setup_shm(GString *cmd_line, void *arg)
+{
+TestServer *server = test_server_new("vhost-user-test", arg);
+test_server_listen(server);
+
+append_mem_opts(server, cmd_line, 256, TEST_MEMFD_SHM);
+server->vu_ops->append_opts(server, cmd_line, "");
+
+g_test_queue_destroy(vhost_user_test_cleanup, server);
+
+return server;
+}
+
 static void test_read_guest_mem(void *obj, void *arg, QGuestAllocator *alloc)
 {
 TestServer *server = arg;
@@ -1081,6 +1099,11 @@ static void register_vhost_user_test(void)
  "virtio-net",
 test_read_guest_mem, &opts);
 
+opts.before = vhost_user_test_setup_shm;
+qos_add_test("vhost-user/read-guest-mem/shm",
+ "virtio-net",
+ test_read_guest_mem, &opts);
+
 if (qemu_memfd_check(MFD_ALLOW_SEALING)) {
 opts.before = vhost_user_test_setup_memfd;
 qos_add_test("vhost-user/read-guest-mem/memfd",
-- 
2.45.2




[PATCH v8 11/13] hostmem: add a new memory backend based on POSIX shm_open()

2024-06-18 Thread Stefano Garzarella
shm_open() creates and opens a new POSIX shared memory object.
A POSIX shared memory object allows creating a memory backend with an
associated file descriptor that can be shared with external processes
(e.g. vhost-user).

The new `memory-backend-shm` can be used as an alternative when
`memory-backend-memfd` is not available (Linux only), since shm_open()
should be provided by any POSIX-compliant operating system.

This backend mimics memfd, allocating memory that is practically
anonymous. In theory shm_open() requires a name, but the name exists
only for a short time interval, since shm_unlink() is called right after
shm_open(). After that, only the fd is shared with external processes
(e.g., vhost-user) as if it were associated with anonymous memory.

In the future we may also allow the user to specify the name to be
passed to shm_open(), but for now we keep the backend simple, mimicking
anonymous memory such as memfd.
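The create-then-unlink pattern described above can be sketched roughly as
follows. This is only an illustration under stated assumptions, not the
code in backends/hostmem-shm.c: the helper name `anon_shm_create()` and
the pid-plus-counter naming scheme are hypothetical.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not QEMU code): create a POSIX shared memory
 * object that is effectively anonymous. Returns an fd, or -1 on failure.
 */
int anon_shm_create(size_t size)
{
    static unsigned counter;
    char name[64];
    int fd;

    /* Assumed naming scheme, used only for this illustration. */
    snprintf(name, sizeof(name), "/anon-shm-%ld-%u",
             (long)getpid(), counter++);

    /* O_EXCL makes the call fail instead of reusing an existing object. */
    fd = shm_open(name, O_RDWR | O_CREAT | O_EXCL, 0600);
    if (fd < 0) {
        return -1;
    }

    /* Drop the name right away: from now on only the fd refers to it. */
    shm_unlink(name);

    if (ftruncate(fd, (off_t)size) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

The returned fd can then be mapped with mmap(MAP_SHARED) and passed to
another process over a UNIX socket, much like a memfd. On older glibc
versions shm_open() may require linking with -lrt.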

Acked-by: David Hildenbrand 
Acked-by: Stefan Hajnoczi 
Acked-by: Markus Armbruster  (QAPI schema)
Signed-off-by: Stefano Garzarella 
---
v8
- Fixed QAPI documentation about share option [Markus]
v7
- changed default value documentation for @share rebasing on
  20240611130231.83152-1-sgarz...@redhat.com [Markus]
- used `memory-backend-shm` instead of `shm` and wrapped to 70 columns
  [Markus]
- added 'if': 'CONFIG_POSIX' to MemoryBackendShmProperties [Markus]
v5
- fixed documentation in qapi/qom.json and qemu-options.hx [Markus]
v4
- fail if we find "share=off" in shm_backend_memory_alloc() [David]
v3
- enriched commit message and documentation to highlight that we
  want to mimic memfd (David)
---
 docs/system/devices/vhost-user.rst |   5 +-
 qapi/qom.json  |  23 +-
 backends/hostmem-shm.c | 123 +
 backends/meson.build   |   1 +
 qemu-options.hx|  16 
 5 files changed, 164 insertions(+), 4 deletions(-)
 create mode 100644 backends/hostmem-shm.c

diff --git a/docs/system/devices/vhost-user.rst 
b/docs/system/devices/vhost-user.rst
index 9b2da106ce..35259d8ec7 100644
--- a/docs/system/devices/vhost-user.rst
+++ b/docs/system/devices/vhost-user.rst
@@ -98,8 +98,9 @@ Shared memory object
 
 In order for the daemon to access the VirtIO queues to process the
 requests it needs access to the guest's address space. This is
-achieved via the ``memory-backend-file`` or ``memory-backend-memfd``
-objects. A reference to a file-descriptor which can access this object
+achieved via the ``memory-backend-file``, ``memory-backend-memfd``, or
+``memory-backend-shm`` objects.
+A reference to a file-descriptor which can access this object
 will be passed via the socket as part of the protocol negotiation.
 
 Currently the shared memory object needs to match the size of the main
diff --git a/qapi/qom.json b/qapi/qom.json
index 9b8f6a7ab5..92b0fea76c 100644
--- a/qapi/qom.json
+++ b/qapi/qom.json
@@ -601,8 +601,8 @@
 #
 # @share: if false, the memory is private to QEMU; if true, it is
 # shared (default false for backends memory-backend-file and
-# memory-backend-ram, true for backends memory-backend-epc and
-# memory-backend-memfd)
+# memory-backend-ram, true for backends memory-backend-epc,
+# memory-backend-memfd, and memory-backend-shm)
 #
 # @reserve: if true, reserve swap space (or huge pages) if applicable
 # (default: true) (since 6.1)
@@ -721,6 +721,21 @@
 '*hugetlbsize': 'size',
 '*seal': 'bool' } }
 
+##
+# @MemoryBackendShmProperties:
+#
+# Properties for memory-backend-shm objects.
+#
+# This memory backend supports only shared memory, which is the
+# default.
+#
+# Since: 9.1
+##
+{ 'struct': 'MemoryBackendShmProperties',
+  'base': 'MemoryBackendProperties',
+  'data': { },
+  'if': 'CONFIG_POSIX' }
+
 ##
 # @MemoryBackendEpcProperties:
 #
@@ -1049,6 +1064,8 @@
 { 'name': 'memory-backend-memfd',
   'if': 'CONFIG_LINUX' },
 'memory-backend-ram',
+{ 'name': 'memory-backend-shm',
+  'if': 'CONFIG_POSIX' },
 'pef-guest',
 { 'name': 'pr-manager-helper',
   'if': 'CONFIG_LINUX' },
@@ -1121,6 +1138,8 @@
   'memory-backend-memfd':   { 'type': 'MemoryBackendMemfdProperties',
   'if': 'CONFIG_LINUX' },
   'memory-backend-ram': 'MemoryBackendProperties',
+  'memory-backend-shm': { 'type': 'MemoryBackendShmProperties',
+  'if': 'CONFIG_POSIX' },
   'pr-manager-helper':  { 'type': 'PrManagerHelperProperties',
   'if': 'CONFIG_LINUX' },
   'qtest':  'QtestProperties',
diff --git a/backends/hostmem-shm.c b/backends/hostmem-shm.c
new file mode 100644
index 00..374edc3db8
--- /dev/null
+++ b/backends/hostmem-shm.c
@@ -0,0 +1,123 @@
+/*
+ * QEMU host POSIX shared memory object backend
+ *
+ * Copyright (C) 2024 Red Hat Inc
+ *
+ * Authors:
+ *   Stefano 

[PATCH v8 10/13] contrib/vhost-user-blk: enable it on any POSIX system

2024-06-18 Thread Stefano Garzarella
Let's make the code more portable by adding defines from
block/file-posix.c to support O_DIRECT on other systems (e.g. macOS).

vhost-user-server.c is a dependency, so let's enable it for any POSIX
system.

Acked-by: Stefan Hajnoczi 
Reviewed-by: Philippe Mathieu-Daudé 
Tested-by: Philippe Mathieu-Daudé 
Signed-off-by: Stefano Garzarella 
---
v6:
- reverted v5 changes since we can't move O_DSYNC and O_DIRECT in osdep
  [Daniel, failing tests on Windows]
v5:
- O_DSYNC and O_DIRECT definition are now in osdep [Phil]
- commit updated since we moved out all code changes
v4:
- moved using of "qemu/bswap.h" API in a separate patch [Phil]
---
 meson.build |  2 --
 contrib/vhost-user-blk/vhost-user-blk.c | 14 ++
 util/meson.build|  4 +++-
 3 files changed, 17 insertions(+), 3 deletions(-)

diff --git a/meson.build b/meson.build
index 487153a431..f86ce92364 100644
--- a/meson.build
+++ b/meson.build
@@ -2001,8 +2001,6 @@ has_statx = cc.has_header_symbol('sys/stat.h', 
'STATX_BASIC_STATS', prefix: gnu_
 has_statx_mnt_id = cc.has_header_symbol('sys/stat.h', 'STATX_MNT_ID', prefix: 
gnu_source_prefix)
 
 have_vhost_user_blk_server = get_option('vhost_user_blk_server') \
-  .require(host_os == 'linux',
-   error_message: 'vhost_user_blk_server requires linux') \
   .require(have_vhost_user,
error_message: 'vhost_user_blk_server requires vhost-user support') 
\
   .disable_auto_if(not have_tools and not have_system) \
diff --git a/contrib/vhost-user-blk/vhost-user-blk.c 
b/contrib/vhost-user-blk/vhost-user-blk.c
index 9492146855..a450337685 100644
--- a/contrib/vhost-user-blk/vhost-user-blk.c
+++ b/contrib/vhost-user-blk/vhost-user-blk.c
@@ -25,6 +25,20 @@
 #include 
 #endif
 
+/* OS X does not have O_DSYNC */
+#ifndef O_DSYNC
+#ifdef O_SYNC
+#define O_DSYNC O_SYNC
+#elif defined(O_FSYNC)
+#define O_DSYNC O_FSYNC
+#endif
+#endif
+
+/* Approximate O_DIRECT with O_DSYNC if O_DIRECT isn't available */
+#ifndef O_DIRECT
+#define O_DIRECT O_DSYNC
+#endif
+
 enum {
 VHOST_USER_BLK_MAX_QUEUES = 8,
 };
diff --git a/util/meson.build b/util/meson.build
index 72b505df11..c414178ace 100644
--- a/util/meson.build
+++ b/util/meson.build
@@ -112,10 +112,12 @@ if have_block
 util_ss.add(files('filemonitor-stub.c'))
   endif
   if host_os == 'linux'
-util_ss.add(files('vhost-user-server.c'), vhost_user)
 util_ss.add(files('vfio-helpers.c'))
 util_ss.add(files('chardev_open.c'))
   endif
+  if host_os != 'windows'
+util_ss.add(files('vhost-user-server.c'), vhost_user)
+  endif
   util_ss.add(files('yank.c'))
 endif
 
-- 
2.45.2




[PATCH v8 09/13] libvhost-user: enable it on any POSIX system

2024-06-18 Thread Stefano Garzarella
The vhost-user protocol is not really Linux-specific so let's enable
libvhost-user for any POSIX system.

Compiling it on macOS and FreeBSD, some problems came up:
- avoid including linux/vhost.h, which is available only on Linux
  (vhost_types.h contains many of the things we need)
- macOS doesn't provide sys/endian.h, so let's define the byte-swap
  macros ourselves (note: libvhost-user doesn't include QEMU's headers,
  so we can't use "qemu/bswap.h")
- define eventfd_[write|read] as write/read wrappers when the system
  doesn't provide them (e.g. macOS)
- copy the SEAL defines from include/qemu/memfd.h to make the code work
  on FreeBSD, where MFD_ALLOW_SEALING is defined
- define MAP_NORESERVE if it's not defined (e.g. on FreeBSD)
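
To illustrate the eventfd item in the list above, here is a minimal
standalone sketch of write/read wrappers. The `compat_` names are
hypothetical and were chosen only to avoid clashing with
<sys/eventfd.h>; the patch itself defines eventfd_write()/eventfd_read()
directly.

```c
#include <stdint.h>
#include <unistd.h>

/* Hypothetical type name, standing in for eventfd_t. */
typedef uint64_t compat_eventfd_t;

/* Write one 64-bit value; 0 on success, -1 on error or short write. */
int compat_eventfd_write(int fd, compat_eventfd_t value)
{
    return write(fd, &value, sizeof(value)) == (ssize_t)sizeof(value)
           ? 0 : -1;
}

/* Read one 64-bit value; 0 on success, -1 on error or short read. */
int compat_eventfd_read(int fd, compat_eventfd_t *value)
{
    return read(fd, value, sizeof(*value)) == (ssize_t)sizeof(*value)
           ? 0 : -1;
}
```

Note that these wrappers only move 8 bytes through whatever fd they are
given (a pipe, for example); they do not reproduce the counter semantics
of a real Linux eventfd.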

Reviewed-by: Philippe Mathieu-Daudé 
Tested-by: Philippe Mathieu-Daudé 
Acked-by: Stefan Hajnoczi 
Signed-off-by: Stefano Garzarella 
---
v5:
- fixed typos in the commit description [Phil]
---
 meson.build   |  2 +-
 subprojects/libvhost-user/libvhost-user.h |  2 +-
 subprojects/libvhost-user/libvhost-user.c | 60 +--
 3 files changed, 59 insertions(+), 5 deletions(-)

diff --git a/meson.build b/meson.build
index 2ba95a8c35..487153a431 100644
--- a/meson.build
+++ b/meson.build
@@ -3190,7 +3190,7 @@ if have_system and vfio_user_server_allowed
 endif
 
 vhost_user = not_found
-if host_os == 'linux' and have_vhost_user
+if have_vhost_user
   libvhost_user = subproject('libvhost-user')
   vhost_user = libvhost_user.get_variable('vhost_user_dep')
 endif
diff --git a/subprojects/libvhost-user/libvhost-user.h 
b/subprojects/libvhost-user/libvhost-user.h
index deb40e77b3..e13e1d3931 100644
--- a/subprojects/libvhost-user/libvhost-user.h
+++ b/subprojects/libvhost-user/libvhost-user.h
@@ -18,9 +18,9 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include "standard-headers/linux/virtio_ring.h"
+#include "standard-headers/linux/vhost_types.h"
 
 /* Based on qemu/hw/virtio/vhost-user.c */
 #define VHOST_USER_F_PROTOCOL_FEATURES 30
diff --git a/subprojects/libvhost-user/libvhost-user.c 
b/subprojects/libvhost-user/libvhost-user.c
index 2c20cdc16e..57e58d4adb 100644
--- a/subprojects/libvhost-user/libvhost-user.c
+++ b/subprojects/libvhost-user/libvhost-user.c
@@ -28,9 +28,7 @@
 #include 
 #include 
 #include 
-#include 
 #include 
-#include 
 
 /* Necessary to provide VIRTIO_F_VERSION_1 on system
  * with older linux headers. Must appear before
@@ -39,8 +37,8 @@
 #include "standard-headers/linux/virtio_config.h"
 
 #if defined(__linux__)
+#include 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -52,6 +50,62 @@
 
 #endif
 
+#if defined(__APPLE__) && (__MACH__)
+#include 
+#define htobe16(x) OSSwapHostToBigInt16(x)
+#define htole16(x) OSSwapHostToLittleInt16(x)
+#define be16toh(x) OSSwapBigToHostInt16(x)
+#define le16toh(x) OSSwapLittleToHostInt16(x)
+
+#define htobe32(x) OSSwapHostToBigInt32(x)
+#define htole32(x) OSSwapHostToLittleInt32(x)
+#define be32toh(x) OSSwapBigToHostInt32(x)
+#define le32toh(x) OSSwapLittleToHostInt32(x)
+
+#define htobe64(x) OSSwapHostToBigInt64(x)
+#define htole64(x) OSSwapHostToLittleInt64(x)
+#define be64toh(x) OSSwapBigToHostInt64(x)
+#define le64toh(x) OSSwapLittleToHostInt64(x)
+#endif
+
+#ifdef CONFIG_EVENTFD
+#include 
+#else
+#define eventfd_t uint64_t
+
+int eventfd_write(int fd, eventfd_t value)
+{
+return (write(fd, &value, sizeof(value)) == sizeof(value)) ? 0 : -1;
+}
+
+int eventfd_read(int fd, eventfd_t *value)
+{
+return (read(fd, value, sizeof(*value)) == sizeof(*value)) ? 0 : -1;
+}
+#endif
+
+#ifdef MFD_ALLOW_SEALING
+#include 
+
+#ifndef F_LINUX_SPECIFIC_BASE
+#define F_LINUX_SPECIFIC_BASE 1024
+#endif
+
+#ifndef F_ADD_SEALS
+#define F_ADD_SEALS (F_LINUX_SPECIFIC_BASE + 9)
+#define F_GET_SEALS (F_LINUX_SPECIFIC_BASE + 10)
+
+#define F_SEAL_SEAL 0x0001  /* prevent further seals from being set */
+#define F_SEAL_SHRINK   0x0002  /* prevent file from shrinking */
+#define F_SEAL_GROW 0x0004  /* prevent file from growing */
+#define F_SEAL_WRITE0x0008  /* prevent writes */
+#endif
+#endif
+
+#ifndef MAP_NORESERVE
+#define MAP_NORESERVE 0
+#endif
+
 #include "include/atomic.h"
 
 #include "libvhost-user.h"
-- 
2.45.2




[PATCH v8 07/13] contrib/vhost-user-*: use QEMU bswap helper functions

2024-06-18 Thread Stefano Garzarella
Let's replace the calls to le*toh() and htole*() with qemu/bswap.h
helpers to make the code more portable.
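
For readers unfamiliar with these helpers, the idea behind a
little-endian conversion that does not depend on <endian.h> can be
sketched as below. The `sketch_` functions are hypothetical stand-ins
written for this example; QEMU's real helpers live in qemu/bswap.h and
are implemented differently.

```c
#include <stdint.h>
#include <string.h>

/* Interpret a 32-bit value stored in little-endian order. */
static inline uint32_t sketch_le32_to_cpu(uint32_t le)
{
    uint8_t b[4];

    memcpy(b, &le, sizeof(b));
    return (uint32_t)b[0] | (uint32_t)b[1] << 8 |
           (uint32_t)b[2] << 16 | (uint32_t)b[3] << 24;
}

/* Produce the little-endian in-memory representation of a host value. */
static inline uint32_t sketch_cpu_to_le32(uint32_t host)
{
    uint8_t b[4] = {
        (uint8_t)(host & 0xff),
        (uint8_t)((host >> 8) & 0xff),
        (uint8_t)((host >> 16) & 0xff),
        (uint8_t)((host >> 24) & 0xff),
    };
    uint32_t le;

    memcpy(&le, b, sizeof(le));
    return le;
}
```

Because the conversion goes through individual bytes, the result is the
same on little-endian and big-endian hosts.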

Suggested-by: Philippe Mathieu-Daudé 
Reviewed-by: Philippe Mathieu-Daudé 
Tested-by: Philippe Mathieu-Daudé 
Acked-by: Stefan Hajnoczi 
Reviewed-by: David Hildenbrand 
Signed-off-by: Stefano Garzarella 
---
 contrib/vhost-user-blk/vhost-user-blk.c |  9 +
 contrib/vhost-user-input/main.c | 16 
 2 files changed, 13 insertions(+), 12 deletions(-)

diff --git a/contrib/vhost-user-blk/vhost-user-blk.c 
b/contrib/vhost-user-blk/vhost-user-blk.c
index a8ab9269a2..9492146855 100644
--- a/contrib/vhost-user-blk/vhost-user-blk.c
+++ b/contrib/vhost-user-blk/vhost-user-blk.c
@@ -16,6 +16,7 @@
  */
 
 #include "qemu/osdep.h"
+#include "qemu/bswap.h"
 #include "standard-headers/linux/virtio_blk.h"
 #include "libvhost-user-glib.h"
 
@@ -194,8 +195,8 @@ vub_discard_write_zeroes(VubReq *req, struct iovec *iov, 
uint32_t iovcnt,
 #if defined(__linux__) && defined(BLKDISCARD) && defined(BLKZEROOUT)
 VubDev *vdev_blk = req->vdev_blk;
 desc = buf;
-uint64_t range[2] = { le64toh(desc->sector) << 9,
-  le32toh(desc->num_sectors) << 9 };
+uint64_t range[2] = { le64_to_cpu(desc->sector) << 9,
+  le32_to_cpu(desc->num_sectors) << 9 };
 if (type == VIRTIO_BLK_T_DISCARD) {
 if (ioctl(vdev_blk->blk_fd, BLKDISCARD, range) == 0) {
 g_free(buf);
@@ -267,13 +268,13 @@ static int vub_virtio_process_req(VubDev *vdev_blk,
 req->in = (struct virtio_blk_inhdr *)elem->in_sg[in_num - 1].iov_base;
 in_num--;
 
-type = le32toh(req->out->type);
+type = le32_to_cpu(req->out->type);
 switch (type & ~VIRTIO_BLK_T_BARRIER) {
 case VIRTIO_BLK_T_IN:
 case VIRTIO_BLK_T_OUT: {
 ssize_t ret = 0;
 bool is_write = type & VIRTIO_BLK_T_OUT;
-req->sector_num = le64toh(req->out->sector);
+req->sector_num = le64_to_cpu(req->out->sector);
 if (is_write) {
 ret  = vub_writev(req, &elem->out_sg[1], out_num);
 } else {
diff --git a/contrib/vhost-user-input/main.c b/contrib/vhost-user-input/main.c
index 081230da54..f3362d41ac 100644
--- a/contrib/vhost-user-input/main.c
+++ b/contrib/vhost-user-input/main.c
@@ -51,8 +51,8 @@ static void vi_input_send(VuInput *vi, struct 
virtio_input_event *event)
 vi->queue[vi->qindex++].event = *event;
 
 /* ... until we see a report sync ... */
-if (event->type != htole16(EV_SYN) ||
-event->code != htole16(SYN_REPORT)) {
+if (event->type != cpu_to_le16(EV_SYN) ||
+event->code != cpu_to_le16(SYN_REPORT)) {
 return;
 }
 
@@ -103,9 +103,9 @@ vi_evdev_watch(VuDev *dev, int condition, void *data)
 
 g_debug("input %d %d %d", evdev.type, evdev.code, evdev.value);
 
-virtio.type  = htole16(evdev.type);
-virtio.code  = htole16(evdev.code);
-virtio.value = htole32(evdev.value);
+virtio.type  = cpu_to_le16(evdev.type);
+virtio.code  = cpu_to_le16(evdev.code);
+virtio.value = cpu_to_le32(evdev.value);
 vi_input_send(vi, &virtio);
 }
 }
@@ -124,9 +124,9 @@ static void vi_handle_status(VuInput *vi, 
virtio_input_event *event)
 
 evdev.input_event_sec = tval.tv_sec;
 evdev.input_event_usec = tval.tv_usec;
-evdev.type = le16toh(event->type);
-evdev.code = le16toh(event->code);
-evdev.value = le32toh(event->value);
+evdev.type = le16_to_cpu(event->type);
+evdev.code = le16_to_cpu(event->code);
+evdev.value = le32_to_cpu(event->value);
 
 rc = write(vi->evdevfd, &evdev, sizeof(evdev));
 if (rc == -1) {
-- 
2.45.2




[PATCH v8 06/13] contrib/vhost-user-blk: fix bind() using the right size of the address

2024-06-18 Thread Stefano Garzarella
On macOS, when passing the `-s /tmp/vhost.socket` parameter to the
vhost-user-blk application, the bind was done on the `/tmp/vhost.socke`
pathname, missing the last character.

This sounds like one of the portability problems described in the
unix(7) manpage:

   Pathname sockets
   When  binding  a socket to a pathname, a few rules should
   be observed for maximum portability and ease of coding:

   •  The pathname in sun_path should be null-terminated.

   •  The length of the pathname, including the  terminating
  null byte, should not exceed the size of sun_path.

   •  The  addrlen  argument  that  describes  the enclosing
  sockaddr_un structure should have a value of at least:

  offsetof(struct sockaddr_un, sun_path) +
  strlen(addr.sun_path)+1

  or,  more  simply,  addrlen  can   be   specified   as
  sizeof(struct sockaddr_un).

So let's follow the last advice and simplify the code as well.
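
For reference, the first formula from the man page can be sketched as a
small helper. A likely explanation for the lost character (an
assumption, not stated in the patch) is that on BSD-derived systems such
as macOS `struct sockaddr_un` has a `sun_len` byte before `sun_family`,
so `sizeof(un.sun_family) + strlen(un.sun_path)` comes out too small,
while offsetof() accounts for every field before `sun_path`. The helper
name `unix_addr_fill()` is hypothetical.

```c
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/*
 * Hypothetical helper: fill *un for a pathname socket and return the
 * smallest addrlen the man page allows, or 0 if the path does not fit.
 */
socklen_t unix_addr_fill(struct sockaddr_un *un, const char *path)
{
    size_t len = strlen(path);

    if (len >= sizeof(un->sun_path)) {
        return 0; /* no room for the terminating null byte */
    }

    memset(un, 0, sizeof(*un));
    un->sun_family = AF_UNIX;
    memcpy(un->sun_path, path, len + 1);

    /* offsetof() covers everything that precedes sun_path. */
    return (socklen_t)(offsetof(struct sockaddr_un, sun_path) + len + 1);
}
```

Passing sizeof(struct sockaddr_un), as the patch does, is the simpler of
the two options and avoids the computation entirely.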

Reviewed-by: Philippe Mathieu-Daudé 
Tested-by: Philippe Mathieu-Daudé 
Acked-by: Stefan Hajnoczi 
Reviewed-by: David Hildenbrand 
Signed-off-by: Stefano Garzarella 
---
 contrib/vhost-user-blk/vhost-user-blk.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/contrib/vhost-user-blk/vhost-user-blk.c 
b/contrib/vhost-user-blk/vhost-user-blk.c
index 89e5f11a64..a8ab9269a2 100644
--- a/contrib/vhost-user-blk/vhost-user-blk.c
+++ b/contrib/vhost-user-blk/vhost-user-blk.c
@@ -469,7 +469,6 @@ static int unix_sock_new(char *unix_fn)
 {
 int sock;
 struct sockaddr_un un;
-size_t len;
 
 assert(unix_fn);
 
@@ -481,10 +480,9 @@ static int unix_sock_new(char *unix_fn)
 
 un.sun_family = AF_UNIX;
 (void)snprintf(un.sun_path, sizeof(un.sun_path), "%s", unix_fn);
-len = sizeof(un.sun_family) + strlen(un.sun_path);
 
 (void)unlink(unix_fn);
-if (bind(sock, (struct sockaddr *)&un, len) < 0) {
+if (bind(sock, (struct sockaddr *)&un, sizeof(un)) < 0) {
 perror("bind");
 goto fail;
 }
-- 
2.45.2




[PATCH v8 01/13] qapi: clarify that the default is backend dependent

2024-06-18 Thread Stefano Garzarella
The default value of the @share option of the @MemoryBackendProperties
really depends on the backend type, so let's document the default
values in the same place where we define the option to avoid
dispersing the information.

Cc: David Hildenbrand 
Suggested-by: Markus Armbruster 
Reviewed-by: Markus Armbruster 
Signed-off-by: Stefano Garzarella 
---
v2: https://patchew.org/QEMU/20240611130231.83152-1-sgarz...@redhat.com/
v1: https://patchew.org/QEMU/20240523133302.103858-1-sgarz...@redhat.com/
---
 qapi/qom.json | 8 +++-
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/qapi/qom.json b/qapi/qom.json
index 8bd299265e..9b8f6a7ab5 100644
--- a/qapi/qom.json
+++ b/qapi/qom.json
@@ -600,7 +600,9 @@
 # preallocation threads (default: none) (since 7.2)
 #
 # @share: if false, the memory is private to QEMU; if true, it is
-# shared (default: false)
+# shared (default false for backends memory-backend-file and
+# memory-backend-ram, true for backends memory-backend-epc and
+# memory-backend-memfd)
 #
 # @reserve: if true, reserve swap space (or huge pages) if applicable
 # (default: true) (since 6.1)
@@ -700,8 +702,6 @@
 #
 # Properties for memory-backend-memfd objects.
 #
-# The @share boolean option is true by default with memfd.
-#
 # @hugetlb: if true, the file to be created resides in the hugetlbfs
 # filesystem (default: false)
 #
@@ -726,8 +726,6 @@
 #
 # Properties for memory-backend-epc objects.
 #
-# The @share boolean option is true by default with epc
-#
 # The @merge boolean option is false by default with epc
 #
 # The @dump boolean option is false by default with epc
-- 
2.45.2




[PATCH v8 04/13] libvhost-user: mask F_INFLIGHT_SHMFD if memfd is not supported

2024-06-18 Thread Stefano Garzarella
libvhost-user will panic when receiving VHOST_USER_GET_INFLIGHT_FD
message if MFD_ALLOW_SEALING is not defined, since it's not able
to create a memfd.

VHOST_USER_GET_INFLIGHT_FD is used only if
VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD is negotiated. So, let's mask
that feature if the backend is not able to properly handle these
messages.

Reviewed-by: Philippe Mathieu-Daudé 
Tested-by: Philippe Mathieu-Daudé 
Acked-by: Stefan Hajnoczi 
Reviewed-by: David Hildenbrand 
Signed-off-by: Stefano Garzarella 
---
 subprojects/libvhost-user/libvhost-user.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/subprojects/libvhost-user/libvhost-user.c 
b/subprojects/libvhost-user/libvhost-user.c
index a11afd1960..2c20cdc16e 100644
--- a/subprojects/libvhost-user/libvhost-user.c
+++ b/subprojects/libvhost-user/libvhost-user.c
@@ -1674,6 +1674,17 @@ vu_get_protocol_features_exec(VuDev *dev, VhostUserMsg 
*vmsg)
 features |= dev->iface->get_protocol_features(dev);
 }
 
+#ifndef MFD_ALLOW_SEALING
+/*
+ * If MFD_ALLOW_SEALING is not defined, we are not able to handle
+ * VHOST_USER_GET_INFLIGHT_FD messages, since we can't create a memfd.
+ * Those messages are used only if VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD
+ * is negotiated. A device implementation can enable it, so let's mask
+ * it to avoid a runtime panic.
+ */
+features &= ~(1ULL << VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD);
+#endif
+
 vmsg_set_reply_u64(vmsg, features);
 return true;
 }
-- 
2.45.2




[PATCH v8 02/13] libvhost-user: set msg.msg_control to NULL when it is empty

2024-06-18 Thread Stefano Garzarella
On some OSes (e.g. macOS), sendmsg() returns -1 (errno EINVAL) if
the `msg_controllen` field of `struct msghdr` is set to 0 but
`msg_control` is not NULL.
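
A standalone sketch of the same rule, outside libvhost-user: attach
SCM_RIGHTS ancillary data only when there are fds to send, and otherwise
leave both control fields cleared. The helper `send_with_fds()` and the
`SKETCH_MAX_FDS` limit are hypothetical names used only in this example.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

#define SKETCH_MAX_FDS 8 /* hypothetical limit for this sketch */

/*
 * Hypothetical helper: send buf over a UNIX socket, attaching nfds file
 * descriptors as SCM_RIGHTS ancillary data when nfds > 0.
 * Returns the sendmsg() result, or -1 if nfds is too large.
 */
ssize_t send_with_fds(int sock, const void *buf, size_t len,
                      const int *fds, size_t nfds)
{
    union {
        char bytes[CMSG_SPACE(SKETCH_MAX_FDS * sizeof(int))];
        struct cmsghdr align; /* forces suitable alignment */
    } control;
    struct iovec iov = { .iov_base = (void *)buf, .iov_len = len };
    struct msghdr msg;

    if (nfds > SKETCH_MAX_FDS) {
        return -1;
    }

    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;

    if (nfds > 0) {
        size_t fdsize = nfds * sizeof(int);
        struct cmsghdr *cmsg;

        memset(&control, 0, sizeof(control));
        msg.msg_control = control.bytes;
        msg.msg_controllen = CMSG_SPACE(fdsize);
        cmsg = CMSG_FIRSTHDR(&msg);
        cmsg->cmsg_len = CMSG_LEN(fdsize);
        cmsg->cmsg_level = SOL_SOCKET;
        cmsg->cmsg_type = SCM_RIGHTS;
        memcpy(CMSG_DATA(cmsg), fds, fdsize);
    } else {
        /* No ancillary data: keep pointer and length consistent. */
        msg.msg_control = NULL;
        msg.msg_controllen = 0;
    }

    return sendmsg(sock, &msg, 0);
}
```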

Reviewed-by: Eric Blake 
Reviewed-by: David Hildenbrand 
Reviewed-by: Philippe Mathieu-Daudé 
Tested-by: Philippe Mathieu-Daudé 
Acked-by: Stefan Hajnoczi 
Signed-off-by: Stefano Garzarella 
---
 subprojects/libvhost-user/libvhost-user.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/subprojects/libvhost-user/libvhost-user.c 
b/subprojects/libvhost-user/libvhost-user.c
index a879149fef..22bea0c775 100644
--- a/subprojects/libvhost-user/libvhost-user.c
+++ b/subprojects/libvhost-user/libvhost-user.c
@@ -632,6 +632,7 @@ vu_message_write(VuDev *dev, int conn_fd, VhostUserMsg 
*vmsg)
 memcpy(CMSG_DATA(cmsg), vmsg->fds, fdsize);
 } else {
 msg.msg_controllen = 0;
+msg.msg_control = NULL;
 }
 
 do {
-- 
2.45.2




[PATCH v8 05/13] vhost-user-server: do not set memory fd non-blocking

2024-06-18 Thread Stefano Garzarella
In vhost-user-server we set all fds received from the other peer
to non-blocking mode. For some of them (e.g. memfd, shm_open, etc.)
this is not really needed, because we don't use these fds for blocking
operations, but only to map memory.

In addition, on some systems this operation can fail (e.g. on macOS,
setting an fd returned by shm_open() non-blocking fails with errno
= ENOTTY).

So, let's avoid setting fds non-blocking for those messages that we
know carry memory fds (e.g. VHOST_USER_ADD_MEM_REG,
VHOST_USER_SET_MEM_TABLE).

Reviewed-by: Daniel P. Berrangé 
Acked-by: Stefan Hajnoczi 
Reviewed-by: David Hildenbrand 
Signed-off-by: Stefano Garzarella 
---
v3:
- avoiding setting fd non-blocking for messages where we have memory fd
  (Eric)
---
 util/vhost-user-server.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/util/vhost-user-server.c b/util/vhost-user-server.c
index 3bfb1ad3ec..b19229074a 100644
--- a/util/vhost-user-server.c
+++ b/util/vhost-user-server.c
@@ -65,6 +65,18 @@ static void vmsg_close_fds(VhostUserMsg *vmsg)
 static void vmsg_unblock_fds(VhostUserMsg *vmsg)
 {
     int i;
+
+    /*
+     * These messages carry fd used to map memory, not to send/receive messages,
+     * so this operation is useless. In addition, in some systems this
+     * operation can fail (e.g. in macOS setting an fd returned by shm_open()
+     * non-blocking fails with errno = ENOTTY)
+     */
+    if (vmsg->request == VHOST_USER_ADD_MEM_REG ||
+        vmsg->request == VHOST_USER_SET_MEM_TABLE) {
+        return;
+    }
+
     for (i = 0; i < vmsg->fd_num; i++) {
         qemu_socket_set_nonblock(vmsg->fds[i]);
     }
-- 
2.45.2
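
As a rough illustration of the same idea without QEMU's helpers, the sketch
below uses plain fcntl() in place of qemu_socket_set_nonblock(). The
function names and the `carries_memory_fds` flag are hypothetical; in the
patch the decision is taken from `vmsg->request`.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for qemu_socket_set_nonblock(). */
static int example_set_nonblock(int fd)
{
    int flags = fcntl(fd, F_GETFL);

    if (flags < 0) {
        return -1;
    }
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/*
 * Set every fd non-blocking unless the message is known to carry memory
 * fds, which are only mmap()ed. Returns how many fds were switched.
 */
static int example_unblock_fds(const int *fds, size_t n,
                               bool carries_memory_fds)
{
    int changed = 0;
    size_t i;

    if (carries_memory_fds) {
        return 0;
    }
    for (i = 0; i < n; i++) {
        if (example_set_nonblock(fds[i]) == 0) {
            changed++;
        }
    }
    return changed;
}
```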




[PATCH v8 03/13] libvhost-user: fail vu_message_write() if sendmsg() is failing

2024-06-18 Thread Stefano Garzarella
In vu_message_write() we use sendmsg() to send the message header,
then a write() to send the payload.

If sendmsg() fails we should avoid sending the payload, since we
were unable to send the header.

Discovered before fixing the issue with the previous patch, where
sendmsg() failed on macOS due to wrong parameters, but the frontend
still sent the payload which the backend incorrectly interpreted
as a wrong header.

Reviewed-by: Daniel P. Berrangé 
Reviewed-by: Philippe Mathieu-Daudé 
Tested-by: Philippe Mathieu-Daudé 
Acked-by: Stefan Hajnoczi 
Reviewed-by: David Hildenbrand 
Signed-off-by: Stefano Garzarella 
---
 subprojects/libvhost-user/libvhost-user.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/subprojects/libvhost-user/libvhost-user.c b/subprojects/libvhost-user/libvhost-user.c
index 22bea0c775..a11afd1960 100644
--- a/subprojects/libvhost-user/libvhost-user.c
+++ b/subprojects/libvhost-user/libvhost-user.c
@@ -639,6 +639,11 @@ vu_message_write(VuDev *dev, int conn_fd, VhostUserMsg *vmsg)
         rc = sendmsg(conn_fd, &msg, 0);
     } while (rc < 0 && (errno == EINTR || errno == EAGAIN));
 
+    if (rc <= 0) {
+        vu_panic(dev, "Error while writing: %s", strerror(errno));
+        return false;
+    }
+
     if (vmsg->size) {
         do {
             if (vmsg->data) {
-- 
2.45.2
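
The ordering this patch enforces (never send a payload whose header failed)
can be shown with a small stand-alone sketch. It is a simplification: both
parts go through write() instead of sendmsg() plus write(), and all names
are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Write the whole buffer, retrying when interrupted by a signal. */
static bool example_write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t rc = write(fd, p, len);

        if (rc < 0 && errno == EINTR) {
            continue;
        }
        if (rc <= 0) {
            return false;
        }
        p += rc;
        len -= (size_t)rc;
    }
    return true;
}

/* Send header, then payload; stop as soon as the header fails. */
static bool example_send_message(int fd, const void *hdr, size_t hdr_len,
                                 const void *payload, size_t payload_len)
{
    if (!example_write_all(fd, hdr, hdr_len)) {
        /* The peer would misparse a payload that has no header. */
        return false;
    }
    if (payload_len == 0) {
        return true;
    }
    return example_write_all(fd, payload, payload_len);
}
```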




[PATCH v8 00/13] vhost-user: support any POSIX system (tested on macOS, FreeBSD, OpenBSD)

2024-06-18 Thread Stefano Garzarella
As discussed with Michael and Markus [1], this version also includes the patch
on which v7 depended to simplify the merge in Michael's tree.

The series is all reviewed, so if there are no new changes required, I would
ask to merge it.

[1] https://patchew.org/QEMU/20240612130140.63004-1-sgarz...@redhat.com/#vabzv4z6g3dd5yndvpmwktcfgbqrdg7qk2e5se6zuflrhss723@dws4vrzen6cs

Thanks,
Stefano

Changelog

v1: https://patchew.org/QEMU/20240228114759.44758-1-sgarz...@redhat.com/
v2: https://patchew.org/QEMU/20240326133936.125332-1-sgarz...@redhat.com/
v3: https://patchew.org/QEMU/20240404122330.92710-1-sgarz...@redhat.com/
v4: https://patchew.org/QEMU/20240508074457.12367-1-sgarz...@redhat.com/
v5: https://patchew.org/QEMU/20240523145522.313012-1-sgarz...@redhat.com/
v6: https://patchew.org/QEMU/20240528103543.145412-1-sgarz...@redhat.com/
v7: https://patchew.org/QEMU/20240612130140.63004-1-sgarz...@redhat.com/
v8:
- Included the dependent patch in this series
  https://patchew.org/QEMU/20240611130231.83152-1-sgarz...@redhat.com/
- Fixed QAPI documentation about share option [Markus]

Description

The vhost-user protocol is not really Linux-specific, so let's try to support
QEMU's frontends and backends (including libvhost-user) in any POSIX system
with this series. The main use case is to be able to use virtio devices that
we don't have built-in in QEMU (e.g. virtiofsd, vhost-user-vsock, etc.) even
in non-Linux systems.

The first 5 patches are more like fixes discovered at runtime on macOS or
FreeBSD that could go even independently of this series.

Patches 6, 7, 8, 9 enable building of frontends and backends (including
libvhost-user) with associated code changes to succeed in compilation.

Patch 10 adds `memory-backend-shm` that uses the POSIX shm_open() API to
create shared memory which is identified by an fd that can be shared with
vhost-user backends. This is useful on those systems (like macOS) where
we don't have memfd_create() or special filesystems like "/dev/shm".
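
For illustration only, the shm_open() approach described above can be
sketched as follows. This is not the code from backends/hostmem-shm.c: the
helper name and the naming scheme are hypothetical, and error handling is
reduced to the minimum.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: create a POSIX shared memory object that behaves
 * like anonymous memory. The name exists only between shm_open() and
 * shm_unlink(); afterwards the object is reachable through the fd alone.
 */
static int example_anon_shm(size_t size)
{
    char name[64];
    int fd;

    /* Short name: some systems (e.g. macOS) limit its length. */
    snprintf(name, sizeof(name), "/example-shm-%ld", (long)getpid());

    fd = shm_open(name, O_RDWR | O_CREAT | O_EXCL, 0600);
    if (fd < 0) {
        return -1;
    }
    shm_unlink(name);

    if (ftruncate(fd, (off_t)size) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

The returned fd can then be mapped with mmap(..., MAP_SHARED, fd, 0) and
passed to another process over a UNIX socket. On older glibc versions
linking may require -lrt.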

Patches 11 and 12 use `memory-backend-shm` in some vhost-user tests.

Maybe the first 5 patches can go separately, but I only discovered those
problems after testing patches 6 - 9, so I have included them in this series
for now. Please let me know if you prefer that I send them separately.

I tested this series using vhost-user-blk and QSD on macOS Sonoma 14.4
(aarch64), FreeBSD 14 (x86_64), OpenBSD 7.4 (x86_64), and Fedora 40 (x86_64)
in this way:

- Start vhost-user-blk or QSD (same commands for all systems)

  vhost-user-blk -s /tmp/vhost.socket \
    -b Fedora-Cloud-Base-39-1.5.x86_64.raw

  qemu-storage-daemon \
    --blockdev file,filename=Fedora-Cloud-Base-39-1.5.x86_64.qcow2,node-name=file \
    --blockdev qcow2,file=file,node-name=qcow2 \
    --export vhost-user-blk,addr.type=unix,addr.path=/tmp/vhost.socket,id=vub,num-queues=1,node-name=qcow2,writable=on

- macOS (aarch64): start QEMU (using hvf accelerator)

  qemu-system-aarch64 -smp 2 -cpu host -M virt,accel=hvf,memory-backend=mem \
    -drive file=./build/pc-bios/edk2-aarch64-code.fd,if=pflash,format=raw,readonly=on \
    -device virtio-net-device,netdev=net0 -netdev user,id=net0 \
    -device ramfb -device usb-ehci -device usb-kbd \
    -object memory-backend-shm,id=mem,size=512M \
    -device vhost-user-blk-pci,num-queues=1,disable-legacy=on,chardev=char0 \
    -chardev socket,id=char0,path=/tmp/vhost.socket

- FreeBSD/OpenBSD (x86_64): start QEMU (no accelerators available)

  qemu-system-x86_64 -smp 2 -M q35,memory-backend=mem \
    -object memory-backend-shm,id=mem,size="512M" \
    -device vhost-user-blk-pci,num-queues=1,chardev=char0 \
    -chardev socket,id=char0,path=/tmp/vhost.socket

- Fedora (x86_64): start QEMU (using kvm accelerator)

  qemu-system-x86_64 -smp 2 -M q35,accel=kvm,memory-backend=mem \
    -object memory-backend-shm,id=mem,size="512M" \
    -device vhost-user-blk-pci,num-queues=1,chardev=char0 \
    -chardev socket,id=char0,path=/tmp/vhost.socket

Branch pushed (and CI started) at
https://gitlab.com/sgarzarella/qemu/-/tree/macos-vhost-user?ref_type=heads

Stefano Garzarella (13):
  qapi: clarify that the default is backend dependent
  libvhost-user: set msg.msg_control to NULL when it is empty
  libvhost-user: fail vu_message_write() if sendmsg() is failing
  libvhost-user: mask F_INFLIGHT_SHMFD if memfd is not supported
  vhost-user-server: do not set memory fd non-blocking
  contrib/vhost-user-blk: fix bind() using the right size of the address
  contrib/vhost-user-*: use QEMU bswap helper functions
  vhost-user: enable frontends on any POSIX system
  libvhost-user: enable it on any POSIX system
  contrib/vhost-user-blk: enable it on any POSIX system
  hostmem: add a new memory backend based on POSIX shm_open()
  tests/qtest/vhost-user-blk-test: use memory-backend-shm
  tests/qtest/vhost-user-test: add a test case for memory-backend-shm

 docs/system/devices/vhost-user.rst|   5 +-
 meson.build   

Re: [PATCH RESEND v7 00/12] vhost-user: support any POSIX system (tested on macOS, FreeBSD, OpenBSD)

2024-06-17 Thread Stefano Garzarella

On Mon, Jun 17, 2024 at 09:42:21AM GMT, Michael S. Tsirkin wrote:

On Mon, Jun 17, 2024 at 02:59:14PM +0200, Stefano Garzarella wrote:

On Mon, Jun 17, 2024 at 02:02:30PM GMT, Markus Armbruster wrote:
> Stefano Garzarella  writes:
>
> > Hi Michael,
> >
> > On Wed, Jun 12, 2024 at 03:01:28PM GMT, Stefano Garzarella wrote:
> > > This series should be in a good shape, in which tree should we queue it?
> > > @Michael would your tree be okay?
> >
> > Markus suggested a small change to patch 10, so do you want me to resend
> > the whole series, or is it okay to resend just the last 3 patches (which
> > are also the ones that depend on the other patch queued by Markus)?
>
> I guess you mean
>
>[PATCH v2] qapi: clarify that the default is backend dependent
>Message-ID: <20240611130231.83152-1-sgarz...@redhat.com>

Yep!

>
> > In the last case I would ask you to queue up the first 9 patches of this
> > series if that is okay with you.
>
> Michael, feel free to merge the patch I queued.
>

I can also include it in v8 if it helps.

Thanks,
Stefano



If I'm to merge it, pls do.
Much less error prone.


Okay, I'll include it in v8.
I'll wait until tomorrow to see if there's any objection on the tree, 
but I think yours is the most suitable.


Thanks,
Stefano




Re: [PATCH RESEND v7 00/12] vhost-user: support any POSIX system (tested on macOS, FreeBSD, OpenBSD)

2024-06-17 Thread Stefano Garzarella

On Mon, Jun 17, 2024 at 02:02:30PM GMT, Markus Armbruster wrote:

Stefano Garzarella  writes:


Hi Michael,

On Wed, Jun 12, 2024 at 03:01:28PM GMT, Stefano Garzarella wrote:

This series should be in a good shape, in which tree should we queue it?
@Michael would your tree be okay?


Markus suggested a small change to patch 10, so do you want me to resend the 
whole series, or is it okay to resend just the last 3 patches (which are also 
the ones that depend on the other patch queued by Markus)?


I guess you mean

   [PATCH v2] qapi: clarify that the default is backend dependent
   Message-ID: <20240611130231.83152-1-sgarz...@redhat.com>


Yep!




In the last case I would ask you to queue up the first 9 patches of this series 
if that is okay with you.


Michael, feel free to merge the patch I queued.



I can also include it in v8 if it helps.

Thanks,
Stefano




Re: [PATCH RESEND v7 00/12] vhost-user: support any POSIX system (tested on macOS, FreeBSD, OpenBSD)

2024-06-17 Thread Stefano Garzarella

Hi Michael,

On Wed, Jun 12, 2024 at 03:01:28PM GMT, Stefano Garzarella wrote:

This series should be in a good shape, in which tree should we queue it?
@Michael would your tree be okay?


Markus suggested a small change to patch 10, so do you want me to resend 
the whole series, or is it okay to resend just the last 3 patches (which 
are also the ones that depend on the other patch queued by Markus)?


In the last case I would ask you to queue up the first 9 patches of this 
series if that is okay with you.


Thanks,
Stefano



Thanks,
Stefano

Changelog

v1: https://patchew.org/QEMU/20240228114759.44758-1-sgarz...@redhat.com/
v2: https://patchew.org/QEMU/20240326133936.125332-1-sgarz...@redhat.com/
v3: https://patchew.org/QEMU/20240404122330.92710-1-sgarz...@redhat.com/
v4: https://patchew.org/QEMU/20240508074457.12367-1-sgarz...@redhat.com/
v5: https://patchew.org/QEMU/20240523145522.313012-1-sgarz...@redhat.com/
v6: https://patchew.org/QEMU/20240528103543.145412-1-sgarz...@redhat.com/
v7:
- rebased on
  https://patchew.org/QEMU/20240611130231.83152-1-sgarz...@redhat.com/
  That patch is queued by Markus and only Patch 10 of this series depends on it.
- changed default value documentation for @share [Markus]
- used `memory-backend-shm` instead of `shm` and wrapped to 70 columns
 [Markus]
- added 'if': 'CONFIG_POSIX' to MemoryBackendShmProperties [Markus]

Description

The vhost-user protocol is not really Linux-specific, so let's try to support
QEMU's frontends and backends (including libvhost-user) in any POSIX system
with this series. The main use case is to be able to use virtio devices that
we don't have built-in in QEMU (e.g. virtiofsd, vhost-user-vsock, etc.) even
in non-Linux systems.

The first 5 patches are more like fixes discovered at runtime on macOS or
FreeBSD that could go even independently of this series.

Patches 6, 7, 8, 9 enable building of frontends and backends (including
libvhost-user) with associated code changes to succeed in compilation.

Patch 10 adds `memory-backend-shm` that uses the POSIX shm_open() API to
create shared memory which is identified by an fd that can be shared with
vhost-user backends. This is useful on those systems (like macOS) where
we don't have memfd_create() or special filesystems like "/dev/shm".

Patches 11 and 12 use `memory-backend-shm` in some vhost-user tests.

Maybe the first 5 patches can go separately, but I only discovered those
problems after testing patches 6 - 9, so I have included them in this series
for now. Please let me know if you prefer that I send them separately.

I tested this series using vhost-user-blk and QSD on macOS Sonoma 14.4
(aarch64), FreeBSD 14 (x86_64), OpenBSD 7.4 (x86_64), and Fedora 40 (x86_64)
in this way:

- Start vhost-user-blk or QSD (same commands for all systems)

 vhost-user-blk -s /tmp/vhost.socket \
   -b Fedora-Cloud-Base-39-1.5.x86_64.raw

 qemu-storage-daemon \
   --blockdev file,filename=Fedora-Cloud-Base-39-1.5.x86_64.qcow2,node-name=file \
   --blockdev qcow2,file=file,node-name=qcow2 \
   --export vhost-user-blk,addr.type=unix,addr.path=/tmp/vhost.socket,id=vub,num-queues=1,node-name=qcow2,writable=on

- macOS (aarch64): start QEMU (using hvf accelerator)

 qemu-system-aarch64 -smp 2 -cpu host -M virt,accel=hvf,memory-backend=mem \
   -drive file=./build/pc-bios/edk2-aarch64-code.fd,if=pflash,format=raw,readonly=on \
   -device virtio-net-device,netdev=net0 -netdev user,id=net0 \
   -device ramfb -device usb-ehci -device usb-kbd \
   -object memory-backend-shm,id=mem,size=512M \
   -device vhost-user-blk-pci,num-queues=1,disable-legacy=on,chardev=char0 \
   -chardev socket,id=char0,path=/tmp/vhost.socket

- FreeBSD/OpenBSD (x86_64): start QEMU (no accelerators available)

 qemu-system-x86_64 -smp 2 -M q35,memory-backend=mem \
   -object memory-backend-shm,id=mem,size="512M" \
   -device vhost-user-blk-pci,num-queues=1,chardev=char0 \
   -chardev socket,id=char0,path=/tmp/vhost.socket

- Fedora (x86_64): start QEMU (using kvm accelerator)

 qemu-system-x86_64 -smp 2 -M q35,accel=kvm,memory-backend=mem \
   -object memory-backend-shm,id=mem,size="512M" \
   -device vhost-user-blk-pci,num-queues=1,chardev=char0 \
   -chardev socket,id=char0,path=/tmp/vhost.socket

Branch pushed (and CI started) at
https://gitlab.com/sgarzarella/qemu/-/tree/macos-vhost-user?ref_type=heads

Based-on: 20240611130231.83152-1-sgarz...@redhat.com

Stefano Garzarella (12):
 libvhost-user: set msg.msg_control to NULL when it is empty
 libvhost-user: fail vu_message_write() if sendmsg() is failing
 libvhost-user: mask F_INFLIGHT_SHMFD if memfd is not supported
 vhost-user-server: do not set memory fd non-blocking
 contrib/vhost-user-blk: fix bind() using the right size of the address
 contrib/vhost-user-*: use QEMU bswap helper functions
 vhost-user: enable frontends on any POSIX system
 libvhost-user: enable it on any POSIX system
 contrib/vhost-user-blk: enable it on any POSIX system
 hostmem: add a new

Re: [PATCH RESEND v7 10/12] hostmem: add a new memory backend based on POSIX shm_open()

2024-06-13 Thread Stefano Garzarella

On Wed, Jun 12, 2024 at 03:20:48PM GMT, Markus Armbruster wrote:

Stefano Garzarella  writes:


shm_open() creates and opens a new POSIX shared memory object.
A POSIX shared memory object allows creating memory backend with an
associated file descriptor that can be shared with external processes
(e.g. vhost-user).

The new `memory-backend-shm` can be used as an alternative when
`memory-backend-memfd` is not available (Linux only), since shm_open()
should be provided by any POSIX-compliant operating system.

This backend mimics memfd, allocating memory that is practically
anonymous. In theory shm_open() requires a name, but this is allocated
for a short time interval and shm_unlink() is called right after
shm_open(). After that, only fd is shared with external processes
(e.g., vhost-user) as if it were associated with anonymous memory.

In the future we may also allow the user to specify the name to be
passed to shm_open(), but for now we keep the backend simple, mimicking
anonymous memory such as memfd.

Acked-by: David Hildenbrand 
Acked-by: Stefan Hajnoczi 
Signed-off-by: Stefano Garzarella 


[...]


diff --git a/qapi/qom.json b/qapi/qom.json
index 9b8f6a7ab5..94e4458288 100644
--- a/qapi/qom.json
+++ b/qapi/qom.json
@@ -601,8 +601,8 @@
 #
 # @share: if false, the memory is private to QEMU; if true, it is
 # shared (default false for backends memory-backend-file and
-# memory-backend-ram, true for backends memory-backend-epc and
-# memory-backend-memfd)
+# memory-backend-ram, true for backends memory-backend-epc,
+# memory-backend-memfd, and memory-backend-shm)
 #
 # @reserve: if true, reserve swap space (or huge pages) if applicable
 # (default: true) (since 6.1)
@@ -721,6 +721,22 @@
 '*hugetlbsize': 'size',
 '*seal': 'bool' } }

+##
+# @MemoryBackendShmProperties:
+#
+# Properties for memory-backend-shm objects.
+#
+# Setting @share boolean option (defined in the base type) to false
+# will cause a failure during allocation because it is not
+# supported by this backend.


This is QMP reference documentation.  "Failure during allocation" feels
like unnecessary detail there.  Maybe "This memory backend supports only
shared memory, which is the default."


I'll fix in v8!




+#
+# Since: 9.1
+##
+{ 'struct': 'MemoryBackendShmProperties',
+  'base': 'MemoryBackendProperties',
+  'data': { },
+  'if': 'CONFIG_POSIX' }
+
 ##
 # @MemoryBackendEpcProperties:
 #
@@ -1049,6 +1065,8 @@
 { 'name': 'memory-backend-memfd',
   'if': 'CONFIG_LINUX' },
 'memory-backend-ram',
+{ 'name': 'memory-backend-shm',
+  'if': 'CONFIG_POSIX' },
 'pef-guest',
 { 'name': 'pr-manager-helper',
   'if': 'CONFIG_LINUX' },
@@ -1121,6 +1139,8 @@
   'memory-backend-memfd':   { 'type': 'MemoryBackendMemfdProperties',
   'if': 'CONFIG_LINUX' },
   'memory-backend-ram': 'MemoryBackendProperties',
+  'memory-backend-shm': { 'type': 'MemoryBackendShmProperties',
+  'if': 'CONFIG_POSIX' },
   'pr-manager-helper':  { 'type': 'PrManagerHelperProperties',
   'if': 'CONFIG_LINUX' },
   'qtest':  'QtestProperties',


[...]

Other than that, QAPI schema
Acked-by: Markus Armbruster 



Thanks for the review!
Stefano




[PATCH RESEND v7 02/12] libvhost-user: fail vu_message_write() if sendmsg() is failing

2024-06-12 Thread Stefano Garzarella
In vu_message_write() we use sendmsg() to send the message header,
then a write() to send the payload.

If sendmsg() fails we should avoid sending the payload, since we
were unable to send the header.

Discovered before fixing the issue with the previous patch, where
sendmsg() failed on macOS due to wrong parameters, but the frontend
still sent the payload which the backend incorrectly interpreted
as a wrong header.

Reviewed-by: Daniel P. Berrangé 
Reviewed-by: Philippe Mathieu-Daudé 
Tested-by: Philippe Mathieu-Daudé 
Acked-by: Stefan Hajnoczi 
Reviewed-by: David Hildenbrand 
Signed-off-by: Stefano Garzarella 
---
 subprojects/libvhost-user/libvhost-user.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/subprojects/libvhost-user/libvhost-user.c b/subprojects/libvhost-user/libvhost-user.c
index 22bea0c775..a11afd1960 100644
--- a/subprojects/libvhost-user/libvhost-user.c
+++ b/subprojects/libvhost-user/libvhost-user.c
@@ -639,6 +639,11 @@ vu_message_write(VuDev *dev, int conn_fd, VhostUserMsg *vmsg)
         rc = sendmsg(conn_fd, &msg, 0);
     } while (rc < 0 && (errno == EINTR || errno == EAGAIN));
 
+    if (rc <= 0) {
+        vu_panic(dev, "Error while writing: %s", strerror(errno));
+        return false;
+    }
+
     if (vmsg->size) {
         do {
             if (vmsg->data) {
-- 
2.45.2




[PATCH RESEND v7 01/12] libvhost-user: set msg.msg_control to NULL when it is empty

2024-06-12 Thread Stefano Garzarella
On some OSes (e.g. macOS) sendmsg() returns -1 (errno EINVAL) if
the `msg_controllen` field of `struct msghdr` is set to 0 but
`msg_control` is not NULL.

Reviewed-by: Eric Blake 
Reviewed-by: David Hildenbrand 
Reviewed-by: Philippe Mathieu-Daudé 
Tested-by: Philippe Mathieu-Daudé 
Acked-by: Stefan Hajnoczi 
Signed-off-by: Stefano Garzarella 
---
 subprojects/libvhost-user/libvhost-user.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/subprojects/libvhost-user/libvhost-user.c b/subprojects/libvhost-user/libvhost-user.c
index a879149fef..22bea0c775 100644
--- a/subprojects/libvhost-user/libvhost-user.c
+++ b/subprojects/libvhost-user/libvhost-user.c
@@ -632,6 +632,7 @@ vu_message_write(VuDev *dev, int conn_fd, VhostUserMsg *vmsg)
         memcpy(CMSG_DATA(cmsg), vmsg->fds, fdsize);
     } else {
         msg.msg_controllen = 0;
+        msg.msg_control = NULL;
     }
 
     do {
-- 
2.45.2




[PATCH RESEND v7 12/12] tests/qtest/vhost-user-test: add a test case for memory-backend-shm

2024-06-12 Thread Stefano Garzarella
`memory-backend-shm` can be used with vhost-user devices, so let's
add a new test case for it.

Acked-by: Thomas Huth 
Acked-by: Stefan Hajnoczi 
Reviewed-by: David Hildenbrand 
Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Stefano Garzarella 
---
 tests/qtest/vhost-user-test.c | 23 +++
 1 file changed, 23 insertions(+)

diff --git a/tests/qtest/vhost-user-test.c b/tests/qtest/vhost-user-test.c
index d4e437265f..8c1d903b2a 100644
--- a/tests/qtest/vhost-user-test.c
+++ b/tests/qtest/vhost-user-test.c
@@ -44,6 +44,8 @@
 "mem-path=%s,share=on -numa node,memdev=mem"
 #define QEMU_CMD_MEMFD  " -m %d -object memory-backend-memfd,id=mem,size=%dM," \
                         " -numa node,memdev=mem"
+#define QEMU_CMD_SHM    " -m %d -object memory-backend-shm,id=mem,size=%dM," \
+                        " -numa node,memdev=mem"
 #define QEMU_CMD_CHR    " -chardev socket,id=%s,path=%s%s"
 #define QEMU_CMD_NETDEV " -netdev vhost-user,id=hs0,chardev=%s,vhostforce=on"
 
@@ -195,6 +197,7 @@ enum test_memfd {
     TEST_MEMFD_AUTO,
     TEST_MEMFD_YES,
     TEST_MEMFD_NO,
+    TEST_MEMFD_SHM,
 };
 
 static void append_vhost_net_opts(TestServer *s, GString *cmd_line,
@@ -228,6 +231,8 @@ static void append_mem_opts(TestServer *server, GString *cmd_line,
 
     if (memfd == TEST_MEMFD_YES) {
         g_string_append_printf(cmd_line, QEMU_CMD_MEMFD, size, size);
+    } else if (memfd == TEST_MEMFD_SHM) {
+        g_string_append_printf(cmd_line, QEMU_CMD_SHM, size, size);
     } else {
         const char *root = init_hugepagefs() ? : server->tmpfs;
 
 
@@ -788,6 +793,19 @@ static void *vhost_user_test_setup_memfd(GString *cmd_line, void *arg)
     return server;
 }
 
+static void *vhost_user_test_setup_shm(GString *cmd_line, void *arg)
+{
+    TestServer *server = test_server_new("vhost-user-test", arg);
+    test_server_listen(server);
+
+    append_mem_opts(server, cmd_line, 256, TEST_MEMFD_SHM);
+    server->vu_ops->append_opts(server, cmd_line, "");
+
+    g_test_queue_destroy(vhost_user_test_cleanup, server);
+
+    return server;
+}
+
 static void test_read_guest_mem(void *obj, void *arg, QGuestAllocator *alloc)
 {
 TestServer *server = arg;
@@ -1081,6 +1099,11 @@ static void register_vhost_user_test(void)
                  "virtio-net",
                  test_read_guest_mem, &opts);
 
+    opts.before = vhost_user_test_setup_shm;
+    qos_add_test("vhost-user/read-guest-mem/shm",
+                 "virtio-net",
+                 test_read_guest_mem, &opts);
+
     if (qemu_memfd_check(MFD_ALLOW_SEALING)) {
         opts.before = vhost_user_test_setup_memfd;
         qos_add_test("vhost-user/read-guest-mem/memfd",
-- 
2.45.2




[PATCH RESEND v7 10/12] hostmem: add a new memory backend based on POSIX shm_open()

2024-06-12 Thread Stefano Garzarella
shm_open() creates and opens a new POSIX shared memory object.
A POSIX shared memory object allows creating memory backend with an
associated file descriptor that can be shared with external processes
(e.g. vhost-user).

The new `memory-backend-shm` can be used as an alternative when
`memory-backend-memfd` is not available (Linux only), since shm_open()
should be provided by any POSIX-compliant operating system.

This backend mimics memfd, allocating memory that is practically
anonymous. In theory shm_open() requires a name, but this is allocated
for a short time interval and shm_unlink() is called right after
shm_open(). After that, only fd is shared with external processes
(e.g., vhost-user) as if it were associated with anonymous memory.

In the future we may also allow the user to specify the name to be
passed to shm_open(), but for now we keep the backend simple, mimicking
anonymous memory such as memfd.

Acked-by: David Hildenbrand 
Acked-by: Stefan Hajnoczi 
Signed-off-by: Stefano Garzarella 
---
v7
- changed default value documentation for @share rebasing on
  20240611130231.83152-1-sgarz...@redhat.com [Markus]
- used `memory-backend-shm` instead of `shm` and wrapped to 70 columns
  [Markus]
- added 'if': 'CONFIG_POSIX' to MemoryBackendShmProperties [Markus]
v5
- fixed documentation in qapi/qom.json and qemu-options.hx [Markus]
v4
- fail if we find "share=off" in shm_backend_memory_alloc() [David]
v3
- enriched commit message and documentation to highlight that we
  want to mimic memfd (David)
---
 docs/system/devices/vhost-user.rst |   5 +-
 qapi/qom.json  |  24 +-
 backends/hostmem-shm.c | 123 +
 backends/meson.build   |   1 +
 qemu-options.hx|  16 
 5 files changed, 165 insertions(+), 4 deletions(-)
 create mode 100644 backends/hostmem-shm.c

diff --git a/docs/system/devices/vhost-user.rst b/docs/system/devices/vhost-user.rst
index 9b2da106ce..35259d8ec7 100644
--- a/docs/system/devices/vhost-user.rst
+++ b/docs/system/devices/vhost-user.rst
@@ -98,8 +98,9 @@ Shared memory object
 
 In order for the daemon to access the VirtIO queues to process the
 requests it needs access to the guest's address space. This is
-achieved via the ``memory-backend-file`` or ``memory-backend-memfd``
-objects. A reference to a file-descriptor which can access this object
+achieved via the ``memory-backend-file``, ``memory-backend-memfd``, or
+``memory-backend-shm`` objects.
+A reference to a file-descriptor which can access this object
 will be passed via the socket as part of the protocol negotiation.
 
 Currently the shared memory object needs to match the size of the main
diff --git a/qapi/qom.json b/qapi/qom.json
index 9b8f6a7ab5..94e4458288 100644
--- a/qapi/qom.json
+++ b/qapi/qom.json
@@ -601,8 +601,8 @@
 #
 # @share: if false, the memory is private to QEMU; if true, it is
 # shared (default false for backends memory-backend-file and
-# memory-backend-ram, true for backends memory-backend-epc and
-# memory-backend-memfd)
+# memory-backend-ram, true for backends memory-backend-epc,
+# memory-backend-memfd, and memory-backend-shm)
 #
 # @reserve: if true, reserve swap space (or huge pages) if applicable
 # (default: true) (since 6.1)
@@ -721,6 +721,22 @@
 '*hugetlbsize': 'size',
 '*seal': 'bool' } }
 
+##
+# @MemoryBackendShmProperties:
+#
+# Properties for memory-backend-shm objects.
+#
+# Setting @share boolean option (defined in the base type) to false
+# will cause a failure during allocation because it is not
+# supported by this backend.
+#
+# Since: 9.1
+##
+{ 'struct': 'MemoryBackendShmProperties',
+  'base': 'MemoryBackendProperties',
+  'data': { },
+  'if': 'CONFIG_POSIX' }
+
 ##
 # @MemoryBackendEpcProperties:
 #
@@ -1049,6 +1065,8 @@
 { 'name': 'memory-backend-memfd',
   'if': 'CONFIG_LINUX' },
 'memory-backend-ram',
+{ 'name': 'memory-backend-shm',
+  'if': 'CONFIG_POSIX' },
 'pef-guest',
 { 'name': 'pr-manager-helper',
   'if': 'CONFIG_LINUX' },
@@ -1121,6 +1139,8 @@
   'memory-backend-memfd':   { 'type': 'MemoryBackendMemfdProperties',
   'if': 'CONFIG_LINUX' },
   'memory-backend-ram': 'MemoryBackendProperties',
+  'memory-backend-shm': { 'type': 'MemoryBackendShmProperties',
+  'if': 'CONFIG_POSIX' },
   'pr-manager-helper':  { 'type': 'PrManagerHelperProperties',
   'if': 'CONFIG_LINUX' },
   'qtest':  'QtestProperties',
diff --git a/backends/hostmem-shm.c b/backends/hostmem-shm.c
new file mode 100644
index 00..374edc3db8
--- /dev/null
+++ b/backends/hostmem-shm.c
@@ -0,0 +1,123 @@
+/*
+ * QEMU host POSIX shared memory object backend
+ *
+ * Copyright (C) 2024 Red Hat Inc
+ *
+ * Authors:
+ *   Stefano Garzarella 
+ *
+ *

[PATCH RESEND v7 11/12] tests/qtest/vhost-user-blk-test: use memory-backend-shm

2024-06-12 Thread Stefano Garzarella
`memory-backend-memfd` is available only on Linux while the new
`memory-backend-shm` can be used on any POSIX-compliant operating
system. Let's use it so we can run the test in multiple environments.

Since we are here, let's remove `share=on`, which is the default for shm
(and also for memfd).

Acked-by: Thomas Huth 
Acked-by: Stefan Hajnoczi 
Reviewed-by: Philippe Mathieu-Daudé 
Tested-by: Philippe Mathieu-Daudé 
Reviewed-by: David Hildenbrand 
Signed-off-by: Stefano Garzarella 
---
v6
- removed `share=on` since it's the default [David]
---
 tests/qtest/vhost-user-blk-test.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/qtest/vhost-user-blk-test.c b/tests/qtest/vhost-user-blk-test.c
index 117b9acd10..ea90d41232 100644
--- a/tests/qtest/vhost-user-blk-test.c
+++ b/tests/qtest/vhost-user-blk-test.c
@@ -906,7 +906,7 @@ static void start_vhost_user_blk(GString *cmd_line, int vus_instances,
                            vhost_user_blk_bin);
 
     g_string_append_printf(cmd_line,
-            " -object memory-backend-memfd,id=mem,size=256M,share=on "
+            " -object memory-backend-shm,id=mem,size=256M "
             " -M memory-backend=mem -m 256M ");
 
     for (i = 0; i < vus_instances; i++) {
-- 
2.45.2




[PATCH RESEND v7 08/12] libvhost-user: enable it on any POSIX system

2024-06-12 Thread Stefano Garzarella
The vhost-user protocol is not really Linux-specific so let's enable
libvhost-user for any POSIX system.

Compiling it on macOS and FreeBSD exposed some problems, addressed as
follows:
- avoid including linux/vhost.h, which is available only on Linux
  (vhost_types.h contains many of the things we need)
- macOS doesn't provide sys/endian.h, so let's define the byte-swap
  macros ourselves (note: libvhost-user doesn't include QEMU's headers,
  so we can't use "qemu/bswap.h")
- define eventfd_[write|read] as write/read wrappers when the system
  doesn't provide them (e.g. macOS)
- copy the SEAL defines from include/qemu/memfd.h to make the code work
  on FreeBSD, where MFD_ALLOW_SEALING is defined
- define MAP_NORESERVE if it's not defined (e.g. on FreeBSD)

Reviewed-by: Philippe Mathieu-Daudé 
Tested-by: Philippe Mathieu-Daudé 
Acked-by: Stefan Hajnoczi 
Signed-off-by: Stefano Garzarella 
---
v5:
- fixed typos in the commit description [Phil]
---
 meson.build   |  2 +-
 subprojects/libvhost-user/libvhost-user.h |  2 +-
 subprojects/libvhost-user/libvhost-user.c | 60 +--
 3 files changed, 59 insertions(+), 5 deletions(-)

diff --git a/meson.build b/meson.build
index 239044070f..6413e858ea 100644
--- a/meson.build
+++ b/meson.build
@@ -3172,7 +3172,7 @@ if have_system and vfio_user_server_allowed
 endif
 
 vhost_user = not_found
-if host_os == 'linux' and have_vhost_user
+if have_vhost_user
   libvhost_user = subproject('libvhost-user')
   vhost_user = libvhost_user.get_variable('vhost_user_dep')
 endif
diff --git a/subprojects/libvhost-user/libvhost-user.h b/subprojects/libvhost-user/libvhost-user.h
index deb40e77b3..e13e1d3931 100644
--- a/subprojects/libvhost-user/libvhost-user.h
+++ b/subprojects/libvhost-user/libvhost-user.h
@@ -18,9 +18,9 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include "standard-headers/linux/virtio_ring.h"
+#include "standard-headers/linux/vhost_types.h"
 
 /* Based on qemu/hw/virtio/vhost-user.c */
 #define VHOST_USER_F_PROTOCOL_FEATURES 30
diff --git a/subprojects/libvhost-user/libvhost-user.c b/subprojects/libvhost-user/libvhost-user.c
index 2c20cdc16e..57e58d4adb 100644
--- a/subprojects/libvhost-user/libvhost-user.c
+++ b/subprojects/libvhost-user/libvhost-user.c
@@ -28,9 +28,7 @@
 #include 
 #include 
 #include 
-#include 
 #include 
-#include 
 
 /* Necessary to provide VIRTIO_F_VERSION_1 on system
  * with older linux headers. Must appear before
@@ -39,8 +37,8 @@
 #include "standard-headers/linux/virtio_config.h"
 
 #if defined(__linux__)
+#include 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -52,6 +50,62 @@
 
 #endif
 
+#if defined(__APPLE__) && (__MACH__)
+#include 
+#define htobe16(x) OSSwapHostToBigInt16(x)
+#define htole16(x) OSSwapHostToLittleInt16(x)
+#define be16toh(x) OSSwapBigToHostInt16(x)
+#define le16toh(x) OSSwapLittleToHostInt16(x)
+
+#define htobe32(x) OSSwapHostToBigInt32(x)
+#define htole32(x) OSSwapHostToLittleInt32(x)
+#define be32toh(x) OSSwapBigToHostInt32(x)
+#define le32toh(x) OSSwapLittleToHostInt32(x)
+
+#define htobe64(x) OSSwapHostToBigInt64(x)
+#define htole64(x) OSSwapHostToLittleInt64(x)
+#define be64toh(x) OSSwapBigToHostInt64(x)
+#define le64toh(x) OSSwapLittleToHostInt64(x)
+#endif
+
+#ifdef CONFIG_EVENTFD
+#include 
+#else
+#define eventfd_t uint64_t
+
+int eventfd_write(int fd, eventfd_t value)
+{
+return (write(fd, &value, sizeof(value)) == sizeof(value)) ? 0 : -1;
+}
+
+int eventfd_read(int fd, eventfd_t *value)
+{
+return (read(fd, value, sizeof(*value)) == sizeof(*value)) ? 0 : -1;
+}
+#endif
+
+#ifdef MFD_ALLOW_SEALING
+#include 
+
+#ifndef F_LINUX_SPECIFIC_BASE
+#define F_LINUX_SPECIFIC_BASE 1024
+#endif
+
+#ifndef F_ADD_SEALS
+#define F_ADD_SEALS (F_LINUX_SPECIFIC_BASE + 9)
+#define F_GET_SEALS (F_LINUX_SPECIFIC_BASE + 10)
+
+#define F_SEAL_SEAL 0x0001  /* prevent further seals from being set */
+#define F_SEAL_SHRINK   0x0002  /* prevent file from shrinking */
+#define F_SEAL_GROW 0x0004  /* prevent file from growing */
+#define F_SEAL_WRITE 0x0008  /* prevent writes */
+#endif
+#endif
+
+#ifndef MAP_NORESERVE
+#define MAP_NORESERVE 0
+#endif
+
 #include "include/atomic.h"
 
 #include "libvhost-user.h"
-- 
2.45.2
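
Editorial sketch (not part of the patch): the eventfd fallback above
treats eventfd_write()/eventfd_read() as thin wrappers that move exactly
one 64-bit value with write()/read(). The self-contained example below
shows the same idea on a plain pipe. The compat_* names and the
round-trip helper are hypothetical and exist only for illustration; they
are not libvhost-user API.

```c
#include <assert.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical stand-in for eventfd_t on systems without <sys/eventfd.h>. */
typedef uint64_t compat_eventfd_t;

/* Write one 64-bit counter value; succeed only if all 8 bytes were written. */
static int compat_eventfd_write(int fd, compat_eventfd_t value)
{
    return (write(fd, &value, sizeof(value)) == (ssize_t)sizeof(value)) ? 0 : -1;
}

/* Read one 64-bit counter value; succeed only if all 8 bytes were read. */
static int compat_eventfd_read(int fd, compat_eventfd_t *value)
{
    return (read(fd, value, sizeof(*value)) == (ssize_t)sizeof(*value)) ? 0 : -1;
}

/* Send a value through a pipe and read it back with the two wrappers. */
static int compat_eventfd_roundtrip(compat_eventfd_t in, compat_eventfd_t *out)
{
    int fds[2];
    int rc;

    if (pipe(fds) < 0) {
        return -1;
    }
    rc = compat_eventfd_write(fds[1], in);
    if (rc == 0) {
        rc = compat_eventfd_read(fds[0], out);
    }
    close(fds[0]);
    close(fds[1]);
    return rc;
}
```

Note that a pipe does not accumulate a counter the way a real eventfd
does; the wrappers only guarantee the 8-byte framing.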




[PATCH RESEND v7 09/12] contrib/vhost-user-blk: enable it on any POSIX system

2024-06-12 Thread Stefano Garzarella
Let's make the code more portable by adding defines from
block/file-posix.c to support O_DIRECT on other systems (e.g. macOS).

vhost-user-server.c is a dependency, so let's enable it for any POSIX
system.

Acked-by: Stefan Hajnoczi 
Reviewed-by: Philippe Mathieu-Daudé 
Tested-by: Philippe Mathieu-Daudé 
Signed-off-by: Stefano Garzarella 
---
v6:
- reverted v5 changes since we can't move O_DSYNC and O_DIRECT in osdep
  [Daniel, failing tests on Windows]
v5:
- O_DSYNC and O_DIRECT definition are now in osdep [Phil]
- commit updated since we moved out all code changes
v4:
- moved using of "qemu/bswap.h" API in a separate patch [Phil]
---
 meson.build |  2 --
 contrib/vhost-user-blk/vhost-user-blk.c | 14 ++
 util/meson.build|  4 +++-
 3 files changed, 17 insertions(+), 3 deletions(-)

diff --git a/meson.build b/meson.build
index 6413e858ea..8436e0c3d6 100644
--- a/meson.build
+++ b/meson.build
@@ -1985,8 +1985,6 @@ has_statx = cc.has_header_symbol('sys/stat.h', 'STATX_BASIC_STATS', prefix: gnu_
 has_statx_mnt_id = cc.has_header_symbol('sys/stat.h', 'STATX_MNT_ID', prefix: gnu_source_prefix)
 
 have_vhost_user_blk_server = get_option('vhost_user_blk_server') \
-  .require(host_os == 'linux',
-   error_message: 'vhost_user_blk_server requires linux') \
   .require(have_vhost_user,
    error_message: 'vhost_user_blk_server requires vhost-user support') \
   .disable_auto_if(not have_tools and not have_system) \
diff --git a/contrib/vhost-user-blk/vhost-user-blk.c b/contrib/vhost-user-blk/vhost-user-blk.c
index 9492146855..a450337685 100644
--- a/contrib/vhost-user-blk/vhost-user-blk.c
+++ b/contrib/vhost-user-blk/vhost-user-blk.c
@@ -25,6 +25,20 @@
 #include 
 #endif
 
+/* OS X does not have O_DSYNC */
+#ifndef O_DSYNC
+#ifdef O_SYNC
+#define O_DSYNC O_SYNC
+#elif defined(O_FSYNC)
+#define O_DSYNC O_FSYNC
+#endif
+#endif
+
+/* Approximate O_DIRECT with O_DSYNC if O_DIRECT isn't available */
+#ifndef O_DIRECT
+#define O_DIRECT O_DSYNC
+#endif
+
 enum {
 VHOST_USER_BLK_MAX_QUEUES = 8,
 };
diff --git a/util/meson.build b/util/meson.build
index 72b505df11..c414178ace 100644
--- a/util/meson.build
+++ b/util/meson.build
@@ -112,10 +112,12 @@ if have_block
 util_ss.add(files('filemonitor-stub.c'))
   endif
   if host_os == 'linux'
-util_ss.add(files('vhost-user-server.c'), vhost_user)
 util_ss.add(files('vfio-helpers.c'))
 util_ss.add(files('chardev_open.c'))
   endif
+  if host_os != 'windows'
+util_ss.add(files('vhost-user-server.c'), vhost_user)
+  endif
   util_ss.add(files('yank.c'))
 endif
 
-- 
2.45.2
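
Editorial sketch (not part of the patch): the fallback chain above lets
code keep writing O_DIRECT even where the flag does not exist, at the
cost of only approximating its semantics with O_DSYNC. The example
below shows how such a chain could be consumed. demo_blk_open_flags()
is a hypothetical helper, not the function vhost-user-blk actually
uses.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>

/* Fall back to O_SYNC (or O_FSYNC) where O_DSYNC is missing. */
#ifndef O_DSYNC
#ifdef O_SYNC
#define O_DSYNC O_SYNC
#elif defined(O_FSYNC)
#define O_DSYNC O_FSYNC
#endif
#endif

/* Approximate O_DIRECT with O_DSYNC where O_DIRECT is missing. */
#ifndef O_DIRECT
#define O_DIRECT O_DSYNC
#endif

/* Hypothetical helper: compute open(2) flags for a block image. */
static int demo_blk_open_flags(bool writable, bool direct)
{
    int flags = writable ? O_RDWR : O_RDONLY;

    if (direct) {
        flags |= O_DIRECT;
    }
    return flags;
}
```

On Linux, <fcntl.h> exposes O_DIRECT only when _GNU_SOURCE is defined
before the include, so this sketch compiles either way and simply picks
whichever definition is in effect.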




[PATCH RESEND v7 04/12] vhost-user-server: do not set memory fd non-blocking

2024-06-12 Thread Stefano Garzarella
In vhost-user-server we set all fds received from the other peer
to non-blocking mode. For some of them (e.g. memfd, shm_open, etc.)
this is not really needed, because we don't use these fds with blocking
operations, but only to map memory.

In addition, on some systems this operation can fail (e.g. on macOS,
setting an fd returned by shm_open() non-blocking fails with errno
= ENOTTY).

So, let's avoid setting fds non-blocking for those messages that we
know carry memory fds (e.g. VHOST_USER_ADD_MEM_REG,
VHOST_USER_SET_MEM_TABLE).

Reviewed-by: Daniel P. Berrangé 
Acked-by: Stefan Hajnoczi 
Reviewed-by: David Hildenbrand 
Signed-off-by: Stefano Garzarella 
---
v3:
- avoiding setting fd non-blocking for messages where we have memory fd
  (Eric)
---
 util/vhost-user-server.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/util/vhost-user-server.c b/util/vhost-user-server.c
index 3bfb1ad3ec..b19229074a 100644
--- a/util/vhost-user-server.c
+++ b/util/vhost-user-server.c
@@ -65,6 +65,18 @@ static void vmsg_close_fds(VhostUserMsg *vmsg)
 static void vmsg_unblock_fds(VhostUserMsg *vmsg)
 {
 int i;
+
+/*
+ * These messages carry fd used to map memory, not to send/receive messages,
+ * so this operation is useless. In addition, in some systems this
+ * operation can fail (e.g. in macOS setting an fd returned by shm_open()
+ * non-blocking fails with errno = ENOTTY)
+ */
+if (vmsg->request == VHOST_USER_ADD_MEM_REG ||
+vmsg->request == VHOST_USER_SET_MEM_TABLE) {
+return;
+}
+
 for (i = 0; i < vmsg->fd_num; i++) {
 qemu_socket_set_nonblock(vmsg->fds[i]);
 }
-- 
2.45.2
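
Editorial sketch (not part of the patch): the logic above boils down to
"skip O_NONBLOCK for messages whose fds are only ever mmap()ed". The
standalone example below reproduces that decision with plain fcntl().
The demo_* types, the request values, and the helper names are
hypothetical placeholders; they are not the vhost-user protocol numbers
or QEMU's qemu_socket_set_nonblock().

```c
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <unistd.h>

/* Placeholder request codes, NOT the real vhost-user values. */
enum demo_request {
    DEMO_REQ_SET_MEM_TABLE = 1,
    DEMO_REQ_ADD_MEM_REG,
    DEMO_REQ_SET_VRING_KICK,
};

struct demo_msg {
    enum demo_request request;
    int fds[8];
    int fd_num;
};

/* True for messages whose fds are used only to map memory. */
static bool demo_carries_memory_fds(const struct demo_msg *msg)
{
    return msg->request == DEMO_REQ_SET_MEM_TABLE ||
           msg->request == DEMO_REQ_ADD_MEM_REG;
}

/* Set O_NONBLOCK on every fd, except for memory-carrying messages. */
static int demo_unblock_fds(const struct demo_msg *msg)
{
    if (demo_carries_memory_fds(msg)) {
        return 0;
    }
    for (int i = 0; i < msg->fd_num; i++) {
        int flags = fcntl(msg->fds[i], F_GETFL);

        if (flags < 0 || fcntl(msg->fds[i], F_SETFL, flags | O_NONBLOCK) < 0) {
            return -1;
        }
    }
    return 0;
}

/* Report whether O_NONBLOCK is currently set on fd. */
static bool demo_is_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);

    return flags >= 0 && (flags & O_NONBLOCK) != 0;
}
```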




Re: [PATCH v7 00/13] vhost-user: support any POSIX system (tested on macOS, FreeBSD, OpenBSD)

2024-06-12 Thread Stefano Garzarella
Ooops, wrong cover letter. I just resent the whole series.

Sorry for the confusion.
Stefano

On Wed, Jun 12, 2024 at 2:59 PM Stefano Garzarella  wrote:
>
> This series should be in good shape; in which tree should we queue it?
> @Michael, would your tree be okay?
>
> Thanks,
> Stefano
>
> Changelog
>
> v1: https://patchew.org/QEMU/20240228114759.44758-1-sgarz...@redhat.com/
> v2: https://patchew.org/QEMU/20240326133936.125332-1-sgarz...@redhat.com/
> v3: https://patchew.org/QEMU/20240404122330.92710-1-sgarz...@redhat.com/
> v4: https://patchew.org/QEMU/20240508074457.12367-1-sgarz...@redhat.com/
> v5: https://patchew.org/QEMU/20240523145522.313012-1-sgarz...@redhat.com/
> v6: https://patchew.org/QEMU/20240528103543.145412-1-sgarz...@redhat.com/
> v7:
> - rebased on 
> https://patchew.org/QEMU/20240611130231.83152-1-sgarz...@redhat.com/
>   That patch is queued by Markus and only Patch 10 of this series depends on 
> it.
> - changed default value documentation for @share [Markus]
> - used `memory-backend-shm` instead of `shm` and wrapped to 70 columns
>   [Markus]
> - added 'if': 'CONFIG_POSIX' to MemoryBackendShmProperties [Markus]
>
> Description
>
> The vhost-user protocol is not really Linux-specific, so let's try to support
> QEMU's frontends and backends (including libvhost-user) in any POSIX system
> with this series. The main use case is to be able to use virtio devices that
> we don't have built-in in QEMU (e.g. virtiofsd, vhost-user-vsock, etc.) even
> in non-Linux systems.
>
> The first 5 patches are more like fixes discovered at runtime on macOS or
> FreeBSD that could go even independently of this series.
>
> Patches 6, 7, 8, 9 enable building of frontends and backends (including
> libvhost-user) with associated code changes to succeed in compilation.
>
> Patch 10 adds `memory-backend-shm` that uses the POSIX shm_open() API to
> create shared memory which is identified by an fd that can be shared with
> vhost-user backends. This is useful on those systems (like macOS) where
> we don't have memfd_create() or special filesystems like "/dev/shm".
>
> Patches 11 and 12 use `memory-backend-shm` in some vhost-user tests.
>
> Maybe the first 5 patches can go separately, but I only discovered those
> problems after testing patches 6 - 9, so I have included them in this series
> for now. Please let me know if you prefer that I send them separately.
>
> I tested this series using vhost-user-blk and QSD on macOS Sonoma 14.4
> (aarch64), FreeBSD 14 (x86_64), OpenBSD 7.4 (x86_64), and Fedora 40 (x86_64)
> in this way:
>
> - Start vhost-user-blk or QSD (same commands for all systems)
>
>   vhost-user-blk -s /tmp/vhost.socket \
> -b Fedora-Cloud-Base-39-1.5.x86_64.raw
>
>   qemu-storage-daemon \
> --blockdev 
> file,filename=Fedora-Cloud-Base-39-1.5.x86_64.qcow2,node-name=file \
> --blockdev qcow2,file=file,node-name=qcow2 \
> --export 
> vhost-user-blk,addr.type=unix,addr.path=/tmp/vhost.socket,id=vub,num-queues=1,node-name=qcow2,writable=on
>
> - macOS (aarch64): start QEMU (using hvf accelerator)
>
>   qemu-system-aarch64 -smp 2 -cpu host -M virt,accel=hvf,memory-backend=mem \
> -drive 
> file=./build/pc-bios/edk2-aarch64-code.fd,if=pflash,format=raw,readonly=on \
> -device virtio-net-device,netdev=net0 -netdev user,id=net0 \
> -device ramfb -device usb-ehci -device usb-kbd \
> -object memory-backend-shm,id=mem,size=512M \
> -device vhost-user-blk-pci,num-queues=1,disable-legacy=on,chardev=char0 \
> -chardev socket,id=char0,path=/tmp/vhost.socket
>
> - FreeBSD/OpenBSD (x86_64): start QEMU (no accelerators available)
>
>   qemu-system-x86_64 -smp 2 -M q35,memory-backend=mem \
> -object memory-backend-shm,id=mem,size="512M" \
> -device vhost-user-blk-pci,num-queues=1,chardev=char0 \
> -chardev socket,id=char0,path=/tmp/vhost.socket
>
> - Fedora (x86_64): start QEMU (using kvm accelerator)
>
>   qemu-system-x86_64 -smp 2 -M q35,accel=kvm,memory-backend=mem \
> -object memory-backend-shm,id=mem,size="512M" \
> -device vhost-user-blk-pci,num-queues=1,chardev=char0 \
> -chardev socket,id=char0,path=/tmp/vhost.socket
>
> Branch pushed (and CI started) at 
> https://gitlab.com/sgarzarella/qemu/-/tree/macos-vhost-user?ref_type=heads
>
> Based-on: 20240611130231.83152-1-sgarz...@redhat.com
>
> Stefano Garzarella (13):
>   qapi: clarify that the default is backend dependent
>   libvhost-user: set msg.msg_control to NULL when it is empty
>   libvhost-user: fail vu_message_write() if sendmsg() is failing
>   libvhost-user: mask F_INFLIGHT_SHMFD if memfd is not supported
>   vhost-user-server: do not set memory fd non-blocking

[PATCH RESEND v7 06/12] contrib/vhost-user-*: use QEMU bswap helper functions

2024-06-12 Thread Stefano Garzarella
Let's replace the calls to le*toh() and htole*() with qemu/bswap.h
helpers to make the code more portable.

Suggested-by: Philippe Mathieu-Daudé 
Reviewed-by: Philippe Mathieu-Daudé 
Tested-by: Philippe Mathieu-Daudé 
Acked-by: Stefan Hajnoczi 
Reviewed-by: David Hildenbrand 
Signed-off-by: Stefano Garzarella 
---
 contrib/vhost-user-blk/vhost-user-blk.c |  9 +
 contrib/vhost-user-input/main.c | 16 
 2 files changed, 13 insertions(+), 12 deletions(-)

diff --git a/contrib/vhost-user-blk/vhost-user-blk.c b/contrib/vhost-user-blk/vhost-user-blk.c
index a8ab9269a2..9492146855 100644
--- a/contrib/vhost-user-blk/vhost-user-blk.c
+++ b/contrib/vhost-user-blk/vhost-user-blk.c
@@ -16,6 +16,7 @@
  */
 
 #include "qemu/osdep.h"
+#include "qemu/bswap.h"
 #include "standard-headers/linux/virtio_blk.h"
 #include "libvhost-user-glib.h"
 
@@ -194,8 +195,8 @@ vub_discard_write_zeroes(VubReq *req, struct iovec *iov, uint32_t iovcnt,
 #if defined(__linux__) && defined(BLKDISCARD) && defined(BLKZEROOUT)
 VubDev *vdev_blk = req->vdev_blk;
 desc = buf;
-uint64_t range[2] = { le64toh(desc->sector) << 9,
-  le32toh(desc->num_sectors) << 9 };
+uint64_t range[2] = { le64_to_cpu(desc->sector) << 9,
+  le32_to_cpu(desc->num_sectors) << 9 };
 if (type == VIRTIO_BLK_T_DISCARD) {
 if (ioctl(vdev_blk->blk_fd, BLKDISCARD, range) == 0) {
 g_free(buf);
@@ -267,13 +268,13 @@ static int vub_virtio_process_req(VubDev *vdev_blk,
 req->in = (struct virtio_blk_inhdr *)elem->in_sg[in_num - 1].iov_base;
 in_num--;
 
-type = le32toh(req->out->type);
+type = le32_to_cpu(req->out->type);
 switch (type & ~VIRTIO_BLK_T_BARRIER) {
 case VIRTIO_BLK_T_IN:
 case VIRTIO_BLK_T_OUT: {
 ssize_t ret = 0;
 bool is_write = type & VIRTIO_BLK_T_OUT;
-req->sector_num = le64toh(req->out->sector);
+req->sector_num = le64_to_cpu(req->out->sector);
 if (is_write) {
 ret  = vub_writev(req, &elem->out_sg[1], out_num);
 } else {
diff --git a/contrib/vhost-user-input/main.c b/contrib/vhost-user-input/main.c
index 081230da54..f3362d41ac 100644
--- a/contrib/vhost-user-input/main.c
+++ b/contrib/vhost-user-input/main.c
@@ -51,8 +51,8 @@ static void vi_input_send(VuInput *vi, struct virtio_input_event *event)
 vi->queue[vi->qindex++].event = *event;
 
 /* ... until we see a report sync ... */
-if (event->type != htole16(EV_SYN) ||
-event->code != htole16(SYN_REPORT)) {
+if (event->type != cpu_to_le16(EV_SYN) ||
+event->code != cpu_to_le16(SYN_REPORT)) {
 return;
 }
 
@@ -103,9 +103,9 @@ vi_evdev_watch(VuDev *dev, int condition, void *data)
 
 g_debug("input %d %d %d", evdev.type, evdev.code, evdev.value);
 
-virtio.type  = htole16(evdev.type);
-virtio.code  = htole16(evdev.code);
-virtio.value = htole32(evdev.value);
+virtio.type  = cpu_to_le16(evdev.type);
+virtio.code  = cpu_to_le16(evdev.code);
+virtio.value = cpu_to_le32(evdev.value);
 vi_input_send(vi, &virtio);
 }
 }
@@ -124,9 +124,9 @@ static void vi_handle_status(VuInput *vi, virtio_input_event *event)
 
 evdev.input_event_sec = tval.tv_sec;
 evdev.input_event_usec = tval.tv_usec;
-evdev.type = le16toh(event->type);
-evdev.code = le16toh(event->code);
-evdev.value = le32toh(event->value);
+evdev.type = le16_to_cpu(event->type);
+evdev.code = le16_to_cpu(event->code);
+evdev.value = le32_to_cpu(event->value);
 
 rc = write(vi->evdevfd, &evdev, sizeof(evdev));
 if (rc == -1) {
-- 
2.45.2
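
Editorial sketch (not part of the patch): the point of switching to
le*_to_cpu()/cpu_to_le*() is that the conversion no longer depends on a
system header providing le32toh() and friends. The example below shows
one host-endianness-independent way to write such helpers, by
assembling the value byte by byte. The demo_* functions are
hypothetical and are not how qemu/bswap.h implements its helpers.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Store v so that its in-memory bytes are little-endian. */
static uint16_t demo_cpu_to_le16(uint16_t v)
{
    uint8_t b[2] = { (uint8_t)(v & 0xff), (uint8_t)(v >> 8) };
    uint16_t le;

    memcpy(&le, b, sizeof(le));
    return le;
}

/* Interpret the in-memory bytes of le as a little-endian 16-bit value. */
static uint16_t demo_le16_to_cpu(uint16_t le)
{
    uint8_t b[2];

    memcpy(b, &le, sizeof(b));
    return (uint16_t)(b[0] | (b[1] << 8));
}

/* Store v so that its in-memory bytes are little-endian. */
static uint32_t demo_cpu_to_le32(uint32_t v)
{
    uint8_t b[4] = {
        (uint8_t)(v & 0xff), (uint8_t)((v >> 8) & 0xff),
        (uint8_t)((v >> 16) & 0xff), (uint8_t)(v >> 24),
    };
    uint32_t le;

    memcpy(&le, b, sizeof(le));
    return le;
}

/* Interpret the in-memory bytes of le as a little-endian 32-bit value. */
static uint32_t demo_le32_to_cpu(uint32_t le)
{
    uint8_t b[4];

    memcpy(b, &le, sizeof(b));
    return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
           ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}
```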




[PATCH RESEND v7 00/12] vhost-user: support any POSIX system (tested on macOS, FreeBSD, OpenBSD)

2024-06-12 Thread Stefano Garzarella
This series should be in good shape; in which tree should we queue it?
@Michael, would your tree be okay?

Thanks,
Stefano

Changelog

v1: https://patchew.org/QEMU/20240228114759.44758-1-sgarz...@redhat.com/
v2: https://patchew.org/QEMU/20240326133936.125332-1-sgarz...@redhat.com/
v3: https://patchew.org/QEMU/20240404122330.92710-1-sgarz...@redhat.com/
v4: https://patchew.org/QEMU/20240508074457.12367-1-sgarz...@redhat.com/
v5: https://patchew.org/QEMU/20240523145522.313012-1-sgarz...@redhat.com/
v6: https://patchew.org/QEMU/20240528103543.145412-1-sgarz...@redhat.com/
v7:
- rebased on https://patchew.org/QEMU/20240611130231.83152-1-sgarz...@redhat.com/
  That patch is queued by Markus and only Patch 10 of this series depends on it.
- changed default value documentation for @share [Markus]
- used `memory-backend-shm` instead of `shm` and wrapped to 70 columns
  [Markus]
- added 'if': 'CONFIG_POSIX' to MemoryBackendShmProperties [Markus]

Description

The vhost-user protocol is not really Linux-specific, so let's try to support
QEMU's frontends and backends (including libvhost-user) in any POSIX system
with this series. The main use case is to be able to use virtio devices that
we don't have built-in in QEMU (e.g. virtiofsd, vhost-user-vsock, etc.) even
in non-Linux systems.

The first 5 patches are more like fixes discovered at runtime on macOS or
FreeBSD that could go even independently of this series.

Patches 6, 7, 8, 9 enable building of frontends and backends (including
libvhost-user) with associated code changes to succeed in compilation.

Patch 10 adds `memory-backend-shm` that uses the POSIX shm_open() API to
create shared memory which is identified by an fd that can be shared with
vhost-user backends. This is useful on those systems (like macOS) where
we don't have memfd_create() or special filesystems like "/dev/shm".

Patches 11 and 12 use `memory-backend-shm` in some vhost-user tests.

Maybe the first 5 patches can go separately, but I only discovered those
problems after testing patches 6 - 9, so I have included them in this series
for now. Please let me know if you prefer that I send them separately.

I tested this series using vhost-user-blk and QSD on macOS Sonoma 14.4
(aarch64), FreeBSD 14 (x86_64), OpenBSD 7.4 (x86_64), and Fedora 40 (x86_64)
in this way:

- Start vhost-user-blk or QSD (same commands for all systems)

  vhost-user-blk -s /tmp/vhost.socket \
    -b Fedora-Cloud-Base-39-1.5.x86_64.raw

  qemu-storage-daemon \
    --blockdev file,filename=Fedora-Cloud-Base-39-1.5.x86_64.qcow2,node-name=file \
    --blockdev qcow2,file=file,node-name=qcow2 \
    --export vhost-user-blk,addr.type=unix,addr.path=/tmp/vhost.socket,id=vub,num-queues=1,node-name=qcow2,writable=on

- macOS (aarch64): start QEMU (using hvf accelerator)

  qemu-system-aarch64 -smp 2 -cpu host -M virt,accel=hvf,memory-backend=mem \
    -drive file=./build/pc-bios/edk2-aarch64-code.fd,if=pflash,format=raw,readonly=on \
    -device virtio-net-device,netdev=net0 -netdev user,id=net0 \
    -device ramfb -device usb-ehci -device usb-kbd \
    -object memory-backend-shm,id=mem,size=512M \
    -device vhost-user-blk-pci,num-queues=1,disable-legacy=on,chardev=char0 \
    -chardev socket,id=char0,path=/tmp/vhost.socket

- FreeBSD/OpenBSD (x86_64): start QEMU (no accelerators available)

  qemu-system-x86_64 -smp 2 -M q35,memory-backend=mem \
    -object memory-backend-shm,id=mem,size="512M" \
    -device vhost-user-blk-pci,num-queues=1,chardev=char0 \
    -chardev socket,id=char0,path=/tmp/vhost.socket

- Fedora (x86_64): start QEMU (using kvm accelerator)

  qemu-system-x86_64 -smp 2 -M q35,accel=kvm,memory-backend=mem \
    -object memory-backend-shm,id=mem,size="512M" \
    -device vhost-user-blk-pci,num-queues=1,chardev=char0 \
    -chardev socket,id=char0,path=/tmp/vhost.socket

Branch pushed (and CI started) at https://gitlab.com/sgarzarella/qemu/-/tree/macos-vhost-user?ref_type=heads

Based-on: 20240611130231.83152-1-sgarz...@redhat.com

Stefano Garzarella (12):
  libvhost-user: set msg.msg_control to NULL when it is empty
  libvhost-user: fail vu_message_write() if sendmsg() is failing
  libvhost-user: mask F_INFLIGHT_SHMFD if memfd is not supported
  vhost-user-server: do not set memory fd non-blocking
  contrib/vhost-user-blk: fix bind() using the right size of the address
  contrib/vhost-user-*: use QEMU bswap helper functions
  vhost-user: enable frontends on any POSIX system
  libvhost-user: enable it on any POSIX system
  contrib/vhost-user-blk: enable it on any POSIX system
  hostmem: add a new memory backend based on POSIX shm_open()
  tests/qtest/vhost-user-blk-test: use memory-backend-shm
  tests/qtest/vhost-user-test: add a test case for memory-backend-shm

 docs/system/devices/vhost-user.rst|   5 +-
 meson.build   |   5 +-
 qapi/qom.json |  24 -
 subprojects/libvhost-user/libvhost-user.h | 

[PATCH RESEND v7 07/12] vhost-user: enable frontends on any POSIX system

2024-06-12 Thread Stefano Garzarella
The vhost-user protocol is not really Linux-specific so let's enable
vhost-user frontends for any POSIX system.

In vhost_net.c we use VHOST_FILE_UNBIND, which is defined in a
Linux-specific header, so let's define it for other systems as well.

Reviewed-by: Philippe Mathieu-Daudé 
Tested-by: Philippe Mathieu-Daudé 
Acked-by: Stefan Hajnoczi 
Reviewed-by: David Hildenbrand 
Signed-off-by: Stefano Garzarella 
---
 meson.build| 1 -
 hw/net/vhost_net.c | 5 +
 hw/block/Kconfig   | 2 +-
 3 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/meson.build b/meson.build
index ec59effca2..239044070f 100644
--- a/meson.build
+++ b/meson.build
@@ -151,7 +151,6 @@ have_tpm = get_option('tpm') \
 
 # vhost
 have_vhost_user = get_option('vhost_user') \
-  .disable_auto_if(host_os != 'linux') \
   .require(host_os != 'windows',
error_message: 'vhost-user is not available on Windows').allowed()
 have_vhost_vdpa = get_option('vhost_vdpa') \
diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
index fd1a93701a..fced429813 100644
--- a/hw/net/vhost_net.c
+++ b/hw/net/vhost_net.c
@@ -34,8 +34,13 @@
 #include "standard-headers/linux/virtio_ring.h"
 #include "hw/virtio/vhost.h"
 #include "hw/virtio/virtio-bus.h"
+#if defined(__linux__)
 #include "linux-headers/linux/vhost.h"
+#endif
 
+#ifndef VHOST_FILE_UNBIND
+#define VHOST_FILE_UNBIND -1
+#endif
 
 /* Features supported by host kernel. */
 static const int kernel_feature_bits[] = {
diff --git a/hw/block/Kconfig b/hw/block/Kconfig
index 9e8f28f982..29ee09e434 100644
--- a/hw/block/Kconfig
+++ b/hw/block/Kconfig
@@ -40,7 +40,7 @@ config VHOST_USER_BLK
 bool
 # Only PCI devices are provided for now
 default y if VIRTIO_PCI
-depends on VIRTIO && VHOST_USER && LINUX
+depends on VIRTIO && VHOST_USER
 
 config SWIM
 bool
-- 
2.45.2




[PATCH RESEND v7 05/12] contrib/vhost-user-blk: fix bind() using the right size of the address

2024-06-12 Thread Stefano Garzarella
On macOS, when passing the `-s /tmp/vhost.socket` parameter to the
vhost-user-blk application, the bind was done on the `/tmp/vhost.socke`
pathname, missing the last character.

This sounds like one of the portability problems described in the
unix(7) manpage:

   Pathname sockets
   When  binding  a socket to a pathname, a few rules should
   be observed for maximum portability and ease of coding:

   •  The pathname in sun_path should be null-terminated.

   •  The length of the pathname, including the  terminating
  null byte, should not exceed the size of sun_path.

   •  The  addrlen  argument  that  describes  the enclosing
  sockaddr_un structure should have a value of at least:

  offsetof(struct sockaddr_un, sun_path) +
  strlen(addr.sun_path)+1

  or,  more  simply,  addrlen  can   be   specified   as
  sizeof(struct sockaddr_un).

So let's follow the last advice and simplify the code as well.

Reviewed-by: Philippe Mathieu-Daudé 
Tested-by: Philippe Mathieu-Daudé 
Acked-by: Stefan Hajnoczi 
Reviewed-by: David Hildenbrand 
Signed-off-by: Stefano Garzarella 
---
 contrib/vhost-user-blk/vhost-user-blk.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/contrib/vhost-user-blk/vhost-user-blk.c b/contrib/vhost-user-blk/vhost-user-blk.c
index 89e5f11a64..a8ab9269a2 100644
--- a/contrib/vhost-user-blk/vhost-user-blk.c
+++ b/contrib/vhost-user-blk/vhost-user-blk.c
@@ -469,7 +469,6 @@ static int unix_sock_new(char *unix_fn)
 {
 int sock;
 struct sockaddr_un un;
-size_t len;
 
 assert(unix_fn);
 
@@ -481,10 +480,9 @@ static int unix_sock_new(char *unix_fn)
 
 un.sun_family = AF_UNIX;
 (void)snprintf(un.sun_path, sizeof(un.sun_path), "%s", unix_fn);
-len = sizeof(un.sun_family) + strlen(un.sun_path);
 
 (void)unlink(unix_fn);
-if (bind(sock, (struct sockaddr *)&un, len) < 0) {
+if (bind(sock, (struct sockaddr *)&un, sizeof(un)) < 0) {
 perror("bind");
 goto fail;
 }
-- 
2.45.2
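
Editorial sketch (not part of the patch): a plausible explanation for
the truncated name is that on macOS and the BSDs struct sockaddr_un
starts with a one-byte sun_len field and sun_family is itself one byte,
so sizeof(un.sun_family) + strlen(un.sun_path) comes out one byte short
of offsetof(struct sockaddr_un, sun_path) + strlen(un.sun_path). The
example below follows the simpler unix(7) advice of passing
sizeof(struct sockaddr_un). demo_unix_sock_bind() is a hypothetical
helper, not the patched unix_sock_new().

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Bind a stream socket to path; returns the socket fd or -1 on error. */
static int demo_unix_sock_bind(const char *path)
{
    struct sockaddr_un un;
    int sock;
    int n;

    sock = socket(AF_UNIX, SOCK_STREAM, 0);
    if (sock < 0) {
        return -1;
    }

    /* Zero the structure so sun_path is always null-terminated. */
    memset(&un, 0, sizeof(un));
    un.sun_family = AF_UNIX;
    n = snprintf(un.sun_path, sizeof(un.sun_path), "%s", path);
    if (n < 0 || (size_t)n >= sizeof(un.sun_path)) {
        close(sock);
        return -1; /* pathname does not fit in sun_path */
    }

    (void)unlink(path);
    /* Pass the size of the whole structure, as unix(7) suggests. */
    if (bind(sock, (struct sockaddr *)&un, sizeof(un)) < 0) {
        close(sock);
        return -1;
    }
    return sock;
}
```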




[PATCH RESEND v7 03/12] libvhost-user: mask F_INFLIGHT_SHMFD if memfd is not supported

2024-06-12 Thread Stefano Garzarella
libvhost-user will panic when receiving VHOST_USER_GET_INFLIGHT_FD
message if MFD_ALLOW_SEALING is not defined, since it's not able
to create a memfd.

VHOST_USER_GET_INFLIGHT_FD is used only if
VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD is negotiated. So, let's mask
that feature if the backend is not able to properly handle these
messages.

Reviewed-by: Philippe Mathieu-Daudé 
Tested-by: Philippe Mathieu-Daudé 
Acked-by: Stefan Hajnoczi 
Reviewed-by: David Hildenbrand 
Signed-off-by: Stefano Garzarella 
---
 subprojects/libvhost-user/libvhost-user.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/subprojects/libvhost-user/libvhost-user.c b/subprojects/libvhost-user/libvhost-user.c
index a11afd1960..2c20cdc16e 100644
--- a/subprojects/libvhost-user/libvhost-user.c
+++ b/subprojects/libvhost-user/libvhost-user.c
@@ -1674,6 +1674,17 @@ vu_get_protocol_features_exec(VuDev *dev, VhostUserMsg *vmsg)
 features |= dev->iface->get_protocol_features(dev);
 }
 
+#ifndef MFD_ALLOW_SEALING
+/*
+ * If MFD_ALLOW_SEALING is not defined, we are not able to handle
+ * VHOST_USER_GET_INFLIGHT_FD messages, since we can't create a memfd.
+ * Those messages are used only if VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD
+ * is negotiated. A device implementation can enable it, so let's mask
+ * it to avoid a runtime panic.
+ */
+features &= ~(1ULL << VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD);
+#endif
+
 vmsg_set_reply_u64(vmsg, features);
 return true;
 }
-- 
2.45.2
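
Editorial sketch (not part of the patch): the fix above clears a single
bit from the protocol feature mask before it is sent back, so the
frontend never negotiates a feature the backend cannot serve. The
example below isolates that masking step. The bit number and the
demo_* names are placeholders for illustration and should not be taken
as the protocol's actual definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit index standing in for VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD. */
#define DEMO_PROTOCOL_F_INFLIGHT_SHMFD 12

/*
 * Return the feature mask to advertise: if memfd cannot be created,
 * drop the inflight-shmfd bit and leave every other bit untouched.
 */
static uint64_t demo_mask_protocol_features(uint64_t features,
                                            bool memfd_supported)
{
    if (!memfd_supported) {
        features &= ~(1ULL << DEMO_PROTOCOL_F_INFLIGHT_SHMFD);
    }
    return features;
}
```

In the patch the decision is made at compile time with #ifndef
MFD_ALLOW_SEALING; the runtime boolean here only makes the sketch easy
to exercise.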




[PATCH v7 00/13] vhost-user: support any POSIX system (tested on macOS, FreeBSD, OpenBSD)

2024-06-12 Thread Stefano Garzarella
This series should be in good shape; in which tree should we queue it?
@Michael, would your tree be okay?

Thanks,
Stefano

Changelog

v1: https://patchew.org/QEMU/20240228114759.44758-1-sgarz...@redhat.com/
v2: https://patchew.org/QEMU/20240326133936.125332-1-sgarz...@redhat.com/
v3: https://patchew.org/QEMU/20240404122330.92710-1-sgarz...@redhat.com/
v4: https://patchew.org/QEMU/20240508074457.12367-1-sgarz...@redhat.com/
v5: https://patchew.org/QEMU/20240523145522.313012-1-sgarz...@redhat.com/
v6: https://patchew.org/QEMU/20240528103543.145412-1-sgarz...@redhat.com/
v7:
- rebased on https://patchew.org/QEMU/20240611130231.83152-1-sgarz...@redhat.com/
  That patch is queued by Markus and only Patch 10 of this series depends on it.
- changed default value documentation for @share [Markus]
- used `memory-backend-shm` instead of `shm` and wrapped to 70 columns
  [Markus]
- added 'if': 'CONFIG_POSIX' to MemoryBackendShmProperties [Markus]

Description

The vhost-user protocol is not really Linux-specific, so let's try to support
QEMU's frontends and backends (including libvhost-user) in any POSIX system
with this series. The main use case is to be able to use virtio devices that
we don't have built-in in QEMU (e.g. virtiofsd, vhost-user-vsock, etc.) even
in non-Linux systems.

The first 5 patches are more like fixes discovered at runtime on macOS or
FreeBSD that could go even independently of this series.

Patches 6, 7, 8, 9 enable building of frontends and backends (including
libvhost-user) with associated code changes to succeed in compilation.

Patch 10 adds `memory-backend-shm` that uses the POSIX shm_open() API to
create shared memory which is identified by an fd that can be shared with
vhost-user backends. This is useful on those systems (like macOS) where
we don't have memfd_create() or special filesystems like "/dev/shm".

Patches 11 and 12 use `memory-backend-shm` in some vhost-user tests.

Maybe the first 5 patches can go separately, but I only discovered those
problems after testing patches 6 - 9, so I have included them in this series
for now. Please let me know if you prefer that I send them separately.

I tested this series using vhost-user-blk and QSD on macOS Sonoma 14.4
(aarch64), FreeBSD 14 (x86_64), OpenBSD 7.4 (x86_64), and Fedora 40 (x86_64)
in this way:

- Start vhost-user-blk or QSD (same commands for all systems)

  vhost-user-blk -s /tmp/vhost.socket \
    -b Fedora-Cloud-Base-39-1.5.x86_64.raw

  qemu-storage-daemon \
    --blockdev file,filename=Fedora-Cloud-Base-39-1.5.x86_64.qcow2,node-name=file \
    --blockdev qcow2,file=file,node-name=qcow2 \
    --export vhost-user-blk,addr.type=unix,addr.path=/tmp/vhost.socket,id=vub,num-queues=1,node-name=qcow2,writable=on

- macOS (aarch64): start QEMU (using hvf accelerator)

  qemu-system-aarch64 -smp 2 -cpu host -M virt,accel=hvf,memory-backend=mem \
    -drive file=./build/pc-bios/edk2-aarch64-code.fd,if=pflash,format=raw,readonly=on \
    -device virtio-net-device,netdev=net0 -netdev user,id=net0 \
    -device ramfb -device usb-ehci -device usb-kbd \
    -object memory-backend-shm,id=mem,size=512M \
    -device vhost-user-blk-pci,num-queues=1,disable-legacy=on,chardev=char0 \
    -chardev socket,id=char0,path=/tmp/vhost.socket

- FreeBSD/OpenBSD (x86_64): start QEMU (no accelerators available)

  qemu-system-x86_64 -smp 2 -M q35,memory-backend=mem \
    -object memory-backend-shm,id=mem,size="512M" \
    -device vhost-user-blk-pci,num-queues=1,chardev=char0 \
    -chardev socket,id=char0,path=/tmp/vhost.socket

- Fedora (x86_64): start QEMU (using kvm accelerator)

  qemu-system-x86_64 -smp 2 -M q35,accel=kvm,memory-backend=mem \
    -object memory-backend-shm,id=mem,size="512M" \
    -device vhost-user-blk-pci,num-queues=1,chardev=char0 \
    -chardev socket,id=char0,path=/tmp/vhost.socket

Branch pushed (and CI started) at https://gitlab.com/sgarzarella/qemu/-/tree/macos-vhost-user?ref_type=heads

Based-on: 20240611130231.83152-1-sgarz...@redhat.com

Stefano Garzarella (13):
  qapi: clarify that the default is backend dependent
  libvhost-user: set msg.msg_control to NULL when it is empty
  libvhost-user: fail vu_message_write() if sendmsg() is failing
  libvhost-user: mask F_INFLIGHT_SHMFD if memfd is not supported
  vhost-user-server: do not set memory fd non-blocking
  contrib/vhost-user-blk: fix bind() using the right size of the address
  contrib/vhost-user-*: use QEMU bswap helper functions
  vhost-user: enable frontends on any POSIX system
  libvhost-user: enable it on any POSIX system
  contrib/vhost-user-blk: enable it on any POSIX system
  hostmem: add a new memory backend based on POSIX shm_open()
  tests/qtest/vhost-user-blk-test: use memory-backend-shm
  tests/qtest/vhost-user-test: add a test case for memory-backend-shm

 docs/system/devices/vhost-user.rst|   5 +-
 meson.build   |   5 +-
 qapi/qom.json |  28 -
 subprojects

Re: [PATCH v6 10/12] hostmem: add a new memory backend based on POSIX shm_open()

2024-06-04 Thread Stefano Garzarella

On Mon, Jun 03, 2024 at 11:42:35AM GMT, Markus Armbruster wrote:

Stefano Garzarella  writes:


On Wed, May 29, 2024 at 04:50:20PM GMT, Markus Armbruster wrote:

Stefano Garzarella  writes:


shm_open() creates and opens a new POSIX shared memory object.
A POSIX shared memory object allows creating a memory backend with an
associated file descriptor that can be shared with external processes
(e.g. vhost-user).

The new `memory-backend-shm` can be used as an alternative when
`memory-backend-memfd` is not available (Linux only), since shm_open()
should be provided by any POSIX-compliant operating system.

This backend mimics memfd, allocating memory that is practically
anonymous. In theory shm_open() requires a name, but this is allocated
for a short time interval and shm_unlink() is called right after
shm_open(). After that, only fd is shared with external processes
(e.g., vhost-user) as if it were associated with anonymous memory.

In the future we may also allow the user to specify the name to be
passed to shm_open(), but for now we keep the backend simple, mimicking
anonymous memory such as memfd.
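
Editorial sketch (not part of the patch): the "practically anonymous"
behaviour described above can be illustrated with three calls:
shm_open() with a temporary name, an immediate shm_unlink(), and
ftruncate() to size the object. After that only the fd refers to the
memory. The name format and demo_shm_anon_fd() are hypothetical; this
is not the code in backends/hostmem-shm.c.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Create an unnamed-in-practice shared memory object of the given size. */
static int demo_shm_anon_fd(size_t size)
{
    char name[64];
    int fd;

    /* Hypothetical short-lived name; it only exists until shm_unlink(). */
    snprintf(name, sizeof(name), "/demo-shm-%ld", (long)getpid());

    fd = shm_open(name, O_RDWR | O_CREAT | O_EXCL, 0600);
    if (fd < 0) {
        return -1;
    }

    /* Drop the name right away: from now on only the fd reaches the memory. */
    shm_unlink(name);

    if (ftruncate(fd, (off_t)size) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Any process that receives this fd (for example over a vhost-user
socket) can mmap() it with MAP_SHARED and see the same pages. On glibc
versions older than 2.34 the program may need to be linked with -lrt.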

Acked-by: David Hildenbrand 
Acked-by: Stefan Hajnoczi 
Signed-off-by: Stefano Garzarella 
---
v5
- fixed documentation in qapi/qom.json and qemu-options.hx [Markus]
v4
- fail if we find "share=off" in shm_backend_memory_alloc() [David]
v3
- enriched commit message and documentation to highlight that we
  want to mimic memfd (David)
---
 docs/system/devices/vhost-user.rst |   5 +-
 qapi/qom.json  |  19 +
 backends/hostmem-shm.c | 123 +
 backends/meson.build   |   1 +
 qemu-options.hx|  16 
 5 files changed, 162 insertions(+), 2 deletions(-)
 create mode 100644 backends/hostmem-shm.c

diff --git a/docs/system/devices/vhost-user.rst 
b/docs/system/devices/vhost-user.rst
index 9b2da106ce..35259d8ec7 100644
--- a/docs/system/devices/vhost-user.rst
+++ b/docs/system/devices/vhost-user.rst
@@ -98,8 +98,9 @@ Shared memory object

 In order for the daemon to access the VirtIO queues to process the
 requests it needs access to the guest's address space. This is
-achieved via the ``memory-backend-file`` or ``memory-backend-memfd``
-objects. A reference to a file-descriptor which can access this object
+achieved via the ``memory-backend-file``, ``memory-backend-memfd``, or
+``memory-backend-shm`` objects.
+A reference to a file-descriptor which can access this object
 will be passed via the socket as part of the protocol negotiation.

 Currently the shared memory object needs to match the size of the main
diff --git a/qapi/qom.json b/qapi/qom.json
index 38dde6d785..d40592d863 100644
--- a/qapi/qom.json
+++ b/qapi/qom.json
@@ -721,6 +721,21 @@
 '*hugetlbsize': 'size',
 '*seal': 'bool' } }

+##
+# @MemoryBackendShmProperties:
+#
+# Properties for memory-backend-shm objects.
+#
+# The @share boolean option is true by default with shm. Setting it to false
+# will cause a failure during allocation because it is not supported by this
+# backend.


docs/devel/qapi-code-gen.rst:

   For legibility, wrap text paragraphs so every line is at most 70
   characters long.

   Separate sentences with two spaces.

Result:

  # Properties for memory-backend-shm objects.
  #
  # The @share boolean option is true by default with shm.  Setting it
  # to false will cause a failure during allocation because it is not
  # supported by this backend.


Oops, sorry, I'll fix!



However, this contradicts the doc comment for @share:

  # @share: if false, the memory is private to QEMU; if true, it is
  # shared (default: false)

Your intention is to override that text.  But that's less than clear.
Moreover, the documentation of @share is pretty far from this override.
John Snow is working on patches that'll pull it closer.

Hmm, MemoryBackendMemfdProperties has the same override.

I think we should change the doc comment for @share to something like

  # @share: if false, the memory is private to QEMU; if true, it is
  # shared (default depends on the backend type)

and then document the actual default with each backend type.


Yes, I had already seen your comment to an earlier version and sent another 
separate patch:
https://patchew.org/QEMU/20240523133302.103858-1-sgarz...@redhat.com/

Is that okay?


Looks like I'm going through my post-vacation review backlog in
suboptimal order...

Replied there!


Thanks!




+#
+# Since: 9.1
+##
+{ 'struct': 'MemoryBackendShmProperties',
+  'base': 'MemoryBackendProperties',
+  'data': { } }


Let's add 'if': 'CONFIG_POSIX' here.



I think my response to your review at v4 fell through a crack :-)
https://patchew.org/QEMU/20240508074457.12367-1-sgarz...@redhat.com/20240508074457.12367-11-sgarz...@redhat.com/#z3lbtmkn6zlwdhdea7owav3mblttxr3asrmlilwxmkla67tdby@732gn3uuupoq


Dang, it did %-}


I'll bring back my doubts here:

  Do you mean something like this:

 

Re: [PATCH v6 10/12] hostmem: add a new memory backend based on POSIX shm_open()

2024-05-29 Thread Stefano Garzarella

On Wed, May 29, 2024 at 04:50:20PM GMT, Markus Armbruster wrote:

Stefano Garzarella  writes:


shm_open() creates and opens a new POSIX shared memory object.
A POSIX shared memory object allows creating a memory backend with an
associated file descriptor that can be shared with external processes
(e.g. vhost-user).

The new `memory-backend-shm` can be used as an alternative when
`memory-backend-memfd` is not available (Linux only), since shm_open()
should be provided by any POSIX-compliant operating system.

This backend mimics memfd, allocating memory that is practically
anonymous. In theory shm_open() requires a name, but this is allocated
for a short time interval and shm_unlink() is called right after
shm_open(). After that, only fd is shared with external processes
(e.g., vhost-user) as if it were associated with anonymous memory.

In the future we may also allow the user to specify the name to be
passed to shm_open(), but for now we keep the backend simple, mimicking
anonymous memory such as memfd.

Acked-by: David Hildenbrand 
Acked-by: Stefan Hajnoczi 
Signed-off-by: Stefano Garzarella 
---
v5
- fixed documentation in qapi/qom.json and qemu-options.hx [Markus]
v4
- fail if we find "share=off" in shm_backend_memory_alloc() [David]
v3
- enriched commit message and documentation to highlight that we
  want to mimic memfd (David)
---
 docs/system/devices/vhost-user.rst |   5 +-
 qapi/qom.json  |  19 +
 backends/hostmem-shm.c | 123 +
 backends/meson.build   |   1 +
 qemu-options.hx|  16 
 5 files changed, 162 insertions(+), 2 deletions(-)
 create mode 100644 backends/hostmem-shm.c

diff --git a/docs/system/devices/vhost-user.rst b/docs/system/devices/vhost-user.rst
index 9b2da106ce..35259d8ec7 100644
--- a/docs/system/devices/vhost-user.rst
+++ b/docs/system/devices/vhost-user.rst
@@ -98,8 +98,9 @@ Shared memory object

 In order for the daemon to access the VirtIO queues to process the
 requests it needs access to the guest's address space. This is
-achieved via the ``memory-backend-file`` or ``memory-backend-memfd``
-objects. A reference to a file-descriptor which can access this object
+achieved via the ``memory-backend-file``, ``memory-backend-memfd``, or
+``memory-backend-shm`` objects.
+A reference to a file-descriptor which can access this object
 will be passed via the socket as part of the protocol negotiation.

 Currently the shared memory object needs to match the size of the main
diff --git a/qapi/qom.json b/qapi/qom.json
index 38dde6d785..d40592d863 100644
--- a/qapi/qom.json
+++ b/qapi/qom.json
@@ -721,6 +721,21 @@
 '*hugetlbsize': 'size',
 '*seal': 'bool' } }

+##
+# @MemoryBackendShmProperties:
+#
+# Properties for memory-backend-shm objects.
+#
+# The @share boolean option is true by default with shm. Setting it to false
+# will cause a failure during allocation because it is not supported by this
+# backend.


docs/devel/qapi-code-gen.rst:

   For legibility, wrap text paragraphs so every line is at most 70
   characters long.

   Separate sentences with two spaces.

Result:

  # Properties for memory-backend-shm objects.
  #
  # The @share boolean option is true by default with shm.  Setting it
  # to false will cause a failure during allocation because it is not
  # supported by this backend.


Oops, sorry, I'll fix!



However, this contradicts the doc comment for @share:

  # @share: if false, the memory is private to QEMU; if true, it is
  # shared (default: false)

Your intention is to override that text.  But that's less than clear.
Moreover, the documentation of @share is pretty far from this override.
John Snow is working on patches that'll pull it closer.

Hmm, MemoryBackendMemfdProperties has the same override.

I think we should change the doc comment for @share to something like

  # @share: if false, the memory is private to QEMU; if true, it is
  # shared (default depends on the backend type)

and then document the actual default with each backend type.


Yes, I had already seen your comment to an earlier version and sent 
another separate patch:

https://patchew.org/QEMU/20240523133302.103858-1-sgarz...@redhat.com/

Is that okay?




+#
+# Since: 9.1
+##
+{ 'struct': 'MemoryBackendShmProperties',
+  'base': 'MemoryBackendProperties',
+  'data': { } }


Let's add 'if': 'CONFIG_POSIX' here.



I think my response to your review at v4 fell through a crack :-)
https://patchew.org/QEMU/20240508074457.12367-1-sgarz...@redhat.com/20240508074457.12367-11-sgarz...@redhat.com/#z3lbtmkn6zlwdhdea7owav3mblttxr3asrmlilwxmkla67tdby@732gn3uuupoq

I'll bring back my doubts here:

  Do you mean something like this:

  { 'struct': 'MemoryBackendShmProperties',
 'if': 'CONFIG_POSIX',
 'base': 'MemoryBackendProperties',
 'data': { } }

  I didn't because for MemoryBackendMemfdProperties and
  MemoryBackendEpcProperties we have 'if': 'CO

[PATCH v6 10/12] hostmem: add a new memory backend based on POSIX shm_open()

2024-05-28 Thread Stefano Garzarella
shm_open() creates and opens a new POSIX shared memory object.
A POSIX shared memory object allows creating a memory backend with an
associated file descriptor that can be shared with external processes
(e.g. vhost-user).

The new `memory-backend-shm` can be used as an alternative when
`memory-backend-memfd` is not available (Linux only), since shm_open()
should be provided by any POSIX-compliant operating system.

This backend mimics memfd, allocating memory that is practically
anonymous. In theory shm_open() requires a name, but this is allocated
for a short time interval and shm_unlink() is called right after
shm_open(). After that, only fd is shared with external processes
(e.g., vhost-user) as if it were associated with anonymous memory.

In the future we may also allow the user to specify the name to be
passed to shm_open(), but for now we keep the backend simple, mimicking
anonymous memory such as memfd.
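
For readers following the thread, the "practically anonymous" idea described above can be sketched roughly as below. This is a minimal illustration, not the code from backends/hostmem-shm.c: the helper name `example_anon_shm_create` and the name pattern are hypothetical, and the real backend builds its name and flags differently.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: create a POSIX shared memory object, unlink its
 * name right away, and return only the fd, so the memory behaves as if
 * it were anonymous (similar in spirit to memfd).
 */
static int example_anon_shm_create(size_t size)
{
    char name[64];
    int fd;

    assert(size > 0);

    /* The name only needs to be unique for a short time interval. */
    snprintf(name, sizeof(name), "/example-shm-%ld", (long)getpid());

    fd = shm_open(name, O_RDWR | O_CREAT | O_EXCL, 0600);
    if (fd < 0) {
        return -1;
    }

    /* Drop the name: from now on only the fd refers to the object. */
    shm_unlink(name);

    if (ftruncate(fd, (off_t)size) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

The returned fd can then be mapped with mmap(MAP_SHARED) and passed to another process, which is what a vhost-user backend needs.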

Acked-by: David Hildenbrand 
Acked-by: Stefan Hajnoczi 
Signed-off-by: Stefano Garzarella 
---
v5
- fixed documentation in qapi/qom.json and qemu-options.hx [Markus]
v4
- fail if we find "share=off" in shm_backend_memory_alloc() [David]
v3
- enriched commit message and documentation to highlight that we
  want to mimic memfd (David)
---
 docs/system/devices/vhost-user.rst |   5 +-
 qapi/qom.json  |  19 +
 backends/hostmem-shm.c | 123 +
 backends/meson.build   |   1 +
 qemu-options.hx|  16 
 5 files changed, 162 insertions(+), 2 deletions(-)
 create mode 100644 backends/hostmem-shm.c

diff --git a/docs/system/devices/vhost-user.rst b/docs/system/devices/vhost-user.rst
index 9b2da106ce..35259d8ec7 100644
--- a/docs/system/devices/vhost-user.rst
+++ b/docs/system/devices/vhost-user.rst
@@ -98,8 +98,9 @@ Shared memory object
 
 In order for the daemon to access the VirtIO queues to process the
 requests it needs access to the guest's address space. This is
-achieved via the ``memory-backend-file`` or ``memory-backend-memfd``
-objects. A reference to a file-descriptor which can access this object
+achieved via the ``memory-backend-file``, ``memory-backend-memfd``, or
+``memory-backend-shm`` objects.
+A reference to a file-descriptor which can access this object
 will be passed via the socket as part of the protocol negotiation.
 
 Currently the shared memory object needs to match the size of the main
diff --git a/qapi/qom.json b/qapi/qom.json
index 38dde6d785..d40592d863 100644
--- a/qapi/qom.json
+++ b/qapi/qom.json
@@ -721,6 +721,21 @@
 '*hugetlbsize': 'size',
 '*seal': 'bool' } }
 
+##
+# @MemoryBackendShmProperties:
+#
+# Properties for memory-backend-shm objects.
+#
+# The @share boolean option is true by default with shm. Setting it to false
+# will cause a failure during allocation because it is not supported by this
+# backend.
+#
+# Since: 9.1
+##
+{ 'struct': 'MemoryBackendShmProperties',
+  'base': 'MemoryBackendProperties',
+  'data': { } }
+
 ##
 # @MemoryBackendEpcProperties:
 #
@@ -985,6 +1000,8 @@
 { 'name': 'memory-backend-memfd',
   'if': 'CONFIG_LINUX' },
 'memory-backend-ram',
+{ 'name': 'memory-backend-shm',
+  'if': 'CONFIG_POSIX' },
 'pef-guest',
 { 'name': 'pr-manager-helper',
   'if': 'CONFIG_LINUX' },
@@ -1056,6 +1073,8 @@
   'memory-backend-memfd':   { 'type': 'MemoryBackendMemfdProperties',
   'if': 'CONFIG_LINUX' },
   'memory-backend-ram': 'MemoryBackendProperties',
+  'memory-backend-shm': { 'type': 'MemoryBackendShmProperties',
+  'if': 'CONFIG_POSIX' },
   'pr-manager-helper':  { 'type': 'PrManagerHelperProperties',
   'if': 'CONFIG_LINUX' },
   'qtest':  'QtestProperties',
diff --git a/backends/hostmem-shm.c b/backends/hostmem-shm.c
new file mode 100644
index 00..374edc3db8
--- /dev/null
+++ b/backends/hostmem-shm.c
@@ -0,0 +1,123 @@
+/*
+ * QEMU host POSIX shared memory object backend
+ *
+ * Copyright (C) 2024 Red Hat Inc
+ *
+ * Authors:
+ *   Stefano Garzarella 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "sysemu/hostmem.h"
+#include "qapi/error.h"
+
+#define TYPE_MEMORY_BACKEND_SHM "memory-backend-shm"
+
+OBJECT_DECLARE_SIMPLE_TYPE(HostMemoryBackendShm, MEMORY_BACKEND_SHM)
+
+struct HostMemoryBackendShm {
+HostMemoryBackend parent_obj;
+};
+
+static bool
+shm_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
+{
+g_autoptr(GString) shm_name = g_string_new(NULL);
+g_autofree char *backend_name = NULL;
+uint32_t ram_flags;
+int fd, oflag;
+mode_t mode;
+
+if (!backend->size) {
+error_setg(errp, "can

[PATCH v6 11/12] tests/qtest/vhost-user-blk-test: use memory-backend-shm

2024-05-28 Thread Stefano Garzarella
`memory-backend-memfd` is available only on Linux while the new
`memory-backend-shm` can be used on any POSIX-compliant operating
system. Let's use it so we can run the test in multiple environments.

Since we are here, let's remove `share=on` which is the default for shm
(and also for memfd).

Acked-by: Thomas Huth 
Acked-by: Stefan Hajnoczi 
Reviewed-by: Philippe Mathieu-Daudé 
Tested-by: Philippe Mathieu-Daudé 
Reviewed-by: David Hildenbrand 
Signed-off-by: Stefano Garzarella 
---
v6
- removed `share=on` since it's the default [David]
---
 tests/qtest/vhost-user-blk-test.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/qtest/vhost-user-blk-test.c b/tests/qtest/vhost-user-blk-test.c
index 117b9acd10..ea90d41232 100644
--- a/tests/qtest/vhost-user-blk-test.c
+++ b/tests/qtest/vhost-user-blk-test.c
@@ -906,7 +906,7 @@ static void start_vhost_user_blk(GString *cmd_line, int vus_instances,
vhost_user_blk_bin);
 
 g_string_append_printf(cmd_line,
-" -object memory-backend-memfd,id=mem,size=256M,share=on "
+" -object memory-backend-shm,id=mem,size=256M "
 " -M memory-backend=mem -m 256M ");
 
 for (i = 0; i < vus_instances; i++) {
-- 
2.45.1




[PATCH v6 12/12] tests/qtest/vhost-user-test: add a test case for memory-backend-shm

2024-05-28 Thread Stefano Garzarella
`memory-backend-shm` can be used with vhost-user devices, so let's
add a new test case for it.

Acked-by: Thomas Huth 
Acked-by: Stefan Hajnoczi 
Reviewed-by: David Hildenbrand 
Signed-off-by: Stefano Garzarella 
---
 tests/qtest/vhost-user-test.c | 23 +++
 1 file changed, 23 insertions(+)

diff --git a/tests/qtest/vhost-user-test.c b/tests/qtest/vhost-user-test.c
index d4e437265f..8c1d903b2a 100644
--- a/tests/qtest/vhost-user-test.c
+++ b/tests/qtest/vhost-user-test.c
@@ -44,6 +44,8 @@
 "mem-path=%s,share=on -numa node,memdev=mem"
 #define QEMU_CMD_MEMFD  " -m %d -object memory-backend-memfd,id=mem,size=%dM," \
 " -numa node,memdev=mem"
+#define QEMU_CMD_SHM" -m %d -object memory-backend-shm,id=mem,size=%dM," \
+" -numa node,memdev=mem"
 #define QEMU_CMD_CHR" -chardev socket,id=%s,path=%s%s"
 #define QEMU_CMD_NETDEV " -netdev vhost-user,id=hs0,chardev=%s,vhostforce=on"
 
@@ -195,6 +197,7 @@ enum test_memfd {
 TEST_MEMFD_AUTO,
 TEST_MEMFD_YES,
 TEST_MEMFD_NO,
+TEST_MEMFD_SHM,
 };
 
 static void append_vhost_net_opts(TestServer *s, GString *cmd_line,
@@ -228,6 +231,8 @@ static void append_mem_opts(TestServer *server, GString *cmd_line,
 
 if (memfd == TEST_MEMFD_YES) {
 g_string_append_printf(cmd_line, QEMU_CMD_MEMFD, size, size);
+} else if (memfd == TEST_MEMFD_SHM) {
+g_string_append_printf(cmd_line, QEMU_CMD_SHM, size, size);
 } else {
 const char *root = init_hugepagefs() ? : server->tmpfs;
 
@@ -788,6 +793,19 @@ static void *vhost_user_test_setup_memfd(GString *cmd_line, void *arg)
 return server;
 }
 
+static void *vhost_user_test_setup_shm(GString *cmd_line, void *arg)
+{
+TestServer *server = test_server_new("vhost-user-test", arg);
+test_server_listen(server);
+
+append_mem_opts(server, cmd_line, 256, TEST_MEMFD_SHM);
+server->vu_ops->append_opts(server, cmd_line, "");
+
+g_test_queue_destroy(vhost_user_test_cleanup, server);
+
+return server;
+}
+
 static void test_read_guest_mem(void *obj, void *arg, QGuestAllocator *alloc)
 {
 TestServer *server = arg;
@@ -1081,6 +1099,11 @@ static void register_vhost_user_test(void)
  "virtio-net",
  test_read_guest_mem, &opts);
 
+opts.before = vhost_user_test_setup_shm;
+qos_add_test("vhost-user/read-guest-mem/shm",
+ "virtio-net",
+ test_read_guest_mem, &opts);
+
 if (qemu_memfd_check(MFD_ALLOW_SEALING)) {
 opts.before = vhost_user_test_setup_memfd;
 qos_add_test("vhost-user/read-guest-mem/memfd",
-- 
2.45.1




[PATCH v6 09/12] contrib/vhost-user-blk: enable it on any POSIX system

2024-05-28 Thread Stefano Garzarella
Let's make the code more portable by adding defines from
block/file-posix.c to support O_DIRECT in other systems (e.g. macOS).

vhost-user-server.c is a dependency, let's enable it for any POSIX
system.

Acked-by: Stefan Hajnoczi 
Reviewed-by: Philippe Mathieu-Daudé 
Tested-by: Philippe Mathieu-Daudé 
Signed-off-by: Stefano Garzarella 
---
v6:
- reverted v5 changes since we can't move O_DSYNC and O_DIRECT in osdep
  [Daniel, failing tests on Windows]
v5:
- O_DSYNC and O_DIRECT definition are now in osdep [Phil]
- commit updated since we moved out all code changes
v4:
- moved using of "qemu/bswap.h" API in a separate patch [Phil]
---
 meson.build |  2 --
 contrib/vhost-user-blk/vhost-user-blk.c | 14 ++
 util/meson.build|  4 +++-
 3 files changed, 17 insertions(+), 3 deletions(-)

diff --git a/meson.build b/meson.build
index 48e476b237..c89ee7b578 100644
--- a/meson.build
+++ b/meson.build
@@ -1981,8 +1981,6 @@ has_statx = cc.has_header_symbol('sys/stat.h', 'STATX_BASIC_STATS', prefix: gnu_
 has_statx_mnt_id = cc.has_header_symbol('sys/stat.h', 'STATX_MNT_ID', prefix: gnu_source_prefix)
 
 have_vhost_user_blk_server = get_option('vhost_user_blk_server') \
-  .require(host_os == 'linux',
-   error_message: 'vhost_user_blk_server requires linux') \
   .require(have_vhost_user,
error_message: 'vhost_user_blk_server requires vhost-user support') \
   .disable_auto_if(not have_tools and not have_system) \
diff --git a/contrib/vhost-user-blk/vhost-user-blk.c b/contrib/vhost-user-blk/vhost-user-blk.c
index 9492146855..a450337685 100644
--- a/contrib/vhost-user-blk/vhost-user-blk.c
+++ b/contrib/vhost-user-blk/vhost-user-blk.c
@@ -25,6 +25,20 @@
 #include 
 #endif
 
+/* OS X does not have O_DSYNC */
+#ifndef O_DSYNC
+#ifdef O_SYNC
+#define O_DSYNC O_SYNC
+#elif defined(O_FSYNC)
+#define O_DSYNC O_FSYNC
+#endif
+#endif
+
+/* Approximate O_DIRECT with O_DSYNC if O_DIRECT isn't available */
+#ifndef O_DIRECT
+#define O_DIRECT O_DSYNC
+#endif
+
 enum {
 VHOST_USER_BLK_MAX_QUEUES = 8,
 };
diff --git a/util/meson.build b/util/meson.build
index 72b505df11..c414178ace 100644
--- a/util/meson.build
+++ b/util/meson.build
@@ -112,10 +112,12 @@ if have_block
 util_ss.add(files('filemonitor-stub.c'))
   endif
   if host_os == 'linux'
-util_ss.add(files('vhost-user-server.c'), vhost_user)
 util_ss.add(files('vfio-helpers.c'))
 util_ss.add(files('chardev_open.c'))
   endif
+  if host_os != 'windows'
+util_ss.add(files('vhost-user-server.c'), vhost_user)
+  endif
   util_ss.add(files('yank.c'))
 endif
 
-- 
2.45.1




[PATCH v6 03/12] libvhost-user: mask F_INFLIGHT_SHMFD if memfd is not supported

2024-05-28 Thread Stefano Garzarella
libvhost-user will panic when receiving VHOST_USER_GET_INFLIGHT_FD
message if MFD_ALLOW_SEALING is not defined, since it's not able
to create a memfd.

VHOST_USER_GET_INFLIGHT_FD is used only if
VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD is negotiated. So, let's mask
that feature if the backend is not able to properly handle these
messages.

Reviewed-by: Philippe Mathieu-Daudé 
Tested-by: Philippe Mathieu-Daudé 
Acked-by: Stefan Hajnoczi 
Reviewed-by: David Hildenbrand 
Signed-off-by: Stefano Garzarella 
---
 subprojects/libvhost-user/libvhost-user.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/subprojects/libvhost-user/libvhost-user.c b/subprojects/libvhost-user/libvhost-user.c
index a11afd1960..2c20cdc16e 100644
--- a/subprojects/libvhost-user/libvhost-user.c
+++ b/subprojects/libvhost-user/libvhost-user.c
@@ -1674,6 +1674,17 @@ vu_get_protocol_features_exec(VuDev *dev, VhostUserMsg *vmsg)
 features |= dev->iface->get_protocol_features(dev);
 }
 
+#ifndef MFD_ALLOW_SEALING
+/*
+ * If MFD_ALLOW_SEALING is not defined, we are not able to handle
+ * VHOST_USER_GET_INFLIGHT_FD messages, since we can't create a memfd.
+ * Those messages are used only if VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD
+ * is negotiated. A device implementation can enable it, so let's mask
+ * it to avoid a runtime panic.
+ */
+features &= ~(1ULL << VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD);
+#endif
+
 vmsg_set_reply_u64(vmsg, features);
 return true;
 }
-- 
2.45.1
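
The masking step in this patch boils down to clearing one bit in the 64-bit protocol feature word before replying. A minimal standalone sketch, assuming a hypothetical `EXAMPLE_F_INFLIGHT_SHMFD` bit number and helper names that are not part of libvhost-user:

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative bit number only; treat the value as an assumption. */
#define EXAMPLE_F_INFLIGHT_SHMFD 12

/* Clear a single feature bit in a 64-bit feature word. */
static uint64_t example_clear_feature(uint64_t features, unsigned bit)
{
    assert(bit < 64);
    return features & ~(1ULL << bit);
}

/*
 * Hypothetical reply helper: if the platform cannot create a memfd,
 * never advertise the inflight shmfd feature, whatever the device
 * implementation asked for.
 */
static uint64_t example_protocol_features(uint64_t wanted, int can_create_memfd)
{
    if (!can_create_memfd) {
        wanted = example_clear_feature(wanted, EXAMPLE_F_INFLIGHT_SHMFD);
    }
    return wanted;
}
```

The patch does the same thing at compile time with `#ifndef MFD_ALLOW_SEALING`; the runtime flag here is only to keep the sketch testable on any system.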




[PATCH v6 04/12] vhost-user-server: do not set memory fd non-blocking

2024-05-28 Thread Stefano Garzarella
In vhost-user-server we set all fd received from the other peer
in non-blocking mode. For some of them (e.g. memfd, shm_open, etc.)
it's not really needed, because we don't use these fd with blocking
operations, but only to map memory.

In addition, in some systems this operation can fail (e.g. in macOS
setting an fd returned by shm_open() non-blocking fails with errno
= ENOTTY).

So, let's avoid setting fd non-blocking for those messages that we
know carry memory fd (e.g. VHOST_USER_ADD_MEM_REG,
VHOST_USER_SET_MEM_TABLE).

Reviewed-by: Daniel P. Berrangé 
Acked-by: Stefan Hajnoczi 
Reviewed-by: David Hildenbrand 
Signed-off-by: Stefano Garzarella 
---
v3:
- avoiding setting fd non-blocking for messages where we have memory fd
  (Eric)
---
 util/vhost-user-server.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/util/vhost-user-server.c b/util/vhost-user-server.c
index 3bfb1ad3ec..b19229074a 100644
--- a/util/vhost-user-server.c
+++ b/util/vhost-user-server.c
@@ -65,6 +65,18 @@ static void vmsg_close_fds(VhostUserMsg *vmsg)
 static void vmsg_unblock_fds(VhostUserMsg *vmsg)
 {
 int i;
+
+/*
+ * These messages carry fd used to map memory, not to send/receive messages,
+ * so this operation is useless. In addition, in some systems this
+ * operation can fail (e.g. in macOS setting an fd returned by shm_open()
+ * non-blocking fails with errno = ENOTTY)
+ */
+if (vmsg->request == VHOST_USER_ADD_MEM_REG ||
+vmsg->request == VHOST_USER_SET_MEM_TABLE) {
+return;
+}
+
 for (i = 0; i < vmsg->fd_num; i++) {
 qemu_socket_set_nonblock(vmsg->fds[i]);
 }
-- 
2.45.1




[PATCH v6 05/12] contrib/vhost-user-blk: fix bind() using the right size of the address

2024-05-28 Thread Stefano Garzarella
On macOS, when passing the `-s /tmp/vhost.socket` parameter to the
vhost-user-blk application, the bind was done on the `/tmp/vhost.socke`
pathname, missing the last character.

This sounds like one of the portability problems described in the
unix(7) manpage:

   Pathname sockets
   When  binding  a socket to a pathname, a few rules should
   be observed for maximum portability and ease of coding:

   •  The pathname in sun_path should be null-terminated.

   •  The length of the pathname, including the  terminating
  null byte, should not exceed the size of sun_path.

   •  The  addrlen  argument  that  describes  the enclosing
  sockaddr_un structure should have a value of at least:

  offsetof(struct sockaddr_un, sun_path) +
  strlen(addr.sun_path)+1

  or,  more  simply,  addrlen  can   be   specified   as
  sizeof(struct sockaddr_un).

So let's follow the last advice and simplify the code as well.

Reviewed-by: Philippe Mathieu-Daudé 
Tested-by: Philippe Mathieu-Daudé 
Acked-by: Stefan Hajnoczi 
Reviewed-by: David Hildenbrand 
Signed-off-by: Stefano Garzarella 
---
 contrib/vhost-user-blk/vhost-user-blk.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/contrib/vhost-user-blk/vhost-user-blk.c b/contrib/vhost-user-blk/vhost-user-blk.c
index 89e5f11a64..a8ab9269a2 100644
--- a/contrib/vhost-user-blk/vhost-user-blk.c
+++ b/contrib/vhost-user-blk/vhost-user-blk.c
@@ -469,7 +469,6 @@ static int unix_sock_new(char *unix_fn)
 {
 int sock;
 struct sockaddr_un un;
-size_t len;
 
 assert(unix_fn);
 
@@ -481,10 +480,9 @@ static int unix_sock_new(char *unix_fn)
 
 un.sun_family = AF_UNIX;
 (void)snprintf(un.sun_path, sizeof(un.sun_path), "%s", unix_fn);
-len = sizeof(un.sun_family) + strlen(un.sun_path);
 
 (void)unlink(unix_fn);
-if (bind(sock, (struct sockaddr *)&un, len) < 0) {
+if (bind(sock, (struct sockaddr *)&un, sizeof(un)) < 0) {
 perror("bind");
 goto fail;
 }
-- 
2.45.1
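
A standalone sketch of the binding rule quoted from unix(7), passing sizeof(struct sockaddr_un) as addrlen. The helper name `example_unix_sock_bind` is hypothetical. A likely explanation for the truncation on macOS is that its sockaddr_un begins with a one-byte sun_len followed by a one-byte sun_family, so `sizeof(un.sun_family) + strlen(un.sun_path)` comes up one byte short of the full pathname.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical helper: create an AF_UNIX socket bound to 'path'. */
static int example_unix_sock_bind(const char *path)
{
    struct sockaddr_un un;
    int sock;

    assert(path);

    /* The pathname plus its terminating null byte must fit in sun_path. */
    if (strlen(path) >= sizeof(un.sun_path)) {
        return -1;
    }

    sock = socket(AF_UNIX, SOCK_STREAM, 0);
    if (sock < 0) {
        return -1;
    }

    memset(&un, 0, sizeof(un));
    un.sun_family = AF_UNIX;
    snprintf(un.sun_path, sizeof(un.sun_path), "%s", path);

    (void)unlink(path);

    /* Use the size of the whole structure, as unix(7) suggests. */
    if (bind(sock, (struct sockaddr *)&un, sizeof(un)) < 0) {
        close(sock);
        return -1;
    }
    return sock;
}
```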




[PATCH v6 08/12] libvhost-user: enable it on any POSIX system

2024-05-28 Thread Stefano Garzarella
The vhost-user protocol is not really Linux-specific so let's enable
libvhost-user for any POSIX system.

Compiling it on macOS and FreeBSD, some problems came up:
- avoid including linux/vhost.h, which is available only on Linux
  (vhost_types.h contains many of the things we need)
- macOS doesn't provide sys/endian.h, so let's define them
  (note: libvhost-user doesn't include QEMU's headers, so we can't
   use "qemu/bswap.h")
- define eventfd_[write|read] as write/read wrapper when system doesn't
  provide those (e.g. macOS)
- copy SEAL defines from include/qemu/memfd.h to make the code work
  on FreeBSD where MFD_ALLOW_SEALING is defined
- define MAP_NORESERVE if it's not defined (e.g. on FreeBSD)

Reviewed-by: Philippe Mathieu-Daudé 
Tested-by: Philippe Mathieu-Daudé 
Acked-by: Stefan Hajnoczi 
Signed-off-by: Stefano Garzarella 
---
v5:
- fixed typos in the commit description [Phil]
---
 meson.build   |  2 +-
 subprojects/libvhost-user/libvhost-user.h |  2 +-
 subprojects/libvhost-user/libvhost-user.c | 60 +--
 3 files changed, 59 insertions(+), 5 deletions(-)

diff --git a/meson.build b/meson.build
index a72500be77..48e476b237 100644
--- a/meson.build
+++ b/meson.build
@@ -3162,7 +3162,7 @@ if have_system and vfio_user_server_allowed
 endif
 
 vhost_user = not_found
-if host_os == 'linux' and have_vhost_user
+if have_vhost_user
   libvhost_user = subproject('libvhost-user')
   vhost_user = libvhost_user.get_variable('vhost_user_dep')
 endif
diff --git a/subprojects/libvhost-user/libvhost-user.h b/subprojects/libvhost-user/libvhost-user.h
index deb40e77b3..e13e1d3931 100644
--- a/subprojects/libvhost-user/libvhost-user.h
+++ b/subprojects/libvhost-user/libvhost-user.h
@@ -18,9 +18,9 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include "standard-headers/linux/virtio_ring.h"
+#include "standard-headers/linux/vhost_types.h"
 
 /* Based on qemu/hw/virtio/vhost-user.c */
 #define VHOST_USER_F_PROTOCOL_FEATURES 30
diff --git a/subprojects/libvhost-user/libvhost-user.c b/subprojects/libvhost-user/libvhost-user.c
index 2c20cdc16e..57e58d4adb 100644
--- a/subprojects/libvhost-user/libvhost-user.c
+++ b/subprojects/libvhost-user/libvhost-user.c
@@ -28,9 +28,7 @@
 #include 
 #include 
 #include 
-#include 
 #include 
-#include 
 
 /* Necessary to provide VIRTIO_F_VERSION_1 on system
  * with older linux headers. Must appear before
@@ -39,8 +37,8 @@
 #include "standard-headers/linux/virtio_config.h"
 
 #if defined(__linux__)
+#include 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -52,6 +50,62 @@
 
 #endif
 
+#if defined(__APPLE__) && (__MACH__)
+#include 
+#define htobe16(x) OSSwapHostToBigInt16(x)
+#define htole16(x) OSSwapHostToLittleInt16(x)
+#define be16toh(x) OSSwapBigToHostInt16(x)
+#define le16toh(x) OSSwapLittleToHostInt16(x)
+
+#define htobe32(x) OSSwapHostToBigInt32(x)
+#define htole32(x) OSSwapHostToLittleInt32(x)
+#define be32toh(x) OSSwapBigToHostInt32(x)
+#define le32toh(x) OSSwapLittleToHostInt32(x)
+
+#define htobe64(x) OSSwapHostToBigInt64(x)
+#define htole64(x) OSSwapHostToLittleInt64(x)
+#define be64toh(x) OSSwapBigToHostInt64(x)
+#define le64toh(x) OSSwapLittleToHostInt64(x)
+#endif
+
+#ifdef CONFIG_EVENTFD
+#include 
+#else
+#define eventfd_t uint64_t
+
+int eventfd_write(int fd, eventfd_t value)
+{
+return (write(fd, &value, sizeof(value)) == sizeof(value)) ? 0 : -1;
+}
+
+int eventfd_read(int fd, eventfd_t *value)
+{
+return (read(fd, value, sizeof(*value)) == sizeof(*value)) ? 0 : -1;
+}
+#endif
+
+#ifdef MFD_ALLOW_SEALING
+#include 
+
+#ifndef F_LINUX_SPECIFIC_BASE
+#define F_LINUX_SPECIFIC_BASE 1024
+#endif
+
+#ifndef F_ADD_SEALS
+#define F_ADD_SEALS (F_LINUX_SPECIFIC_BASE + 9)
+#define F_GET_SEALS (F_LINUX_SPECIFIC_BASE + 10)
+
+#define F_SEAL_SEAL 0x0001  /* prevent further seals from being set */
+#define F_SEAL_SHRINK   0x0002  /* prevent file from shrinking */
+#define F_SEAL_GROW 0x0004  /* prevent file from growing */
+#define F_SEAL_WRITE0x0008  /* prevent writes */
+#endif
+#endif
+
+#ifndef MAP_NORESERVE
+#define MAP_NORESERVE 0
+#endif
+
 #include "include/atomic.h"
 
 #include "libvhost-user.h"
-- 
2.45.1
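
The eventfd fallback in this patch amounts to moving a single 64-bit value with plain write()/read(). A self-contained sketch with hypothetical `example_` names (chosen so they cannot clash with a system sys/eventfd.h):

```c
#include <assert.h>
#include <stdint.h>
#include <unistd.h>

typedef uint64_t example_eventfd_t;

/* Write one 64-bit counter value; 0 on success, -1 on short write. */
static int example_eventfd_write(int fd, example_eventfd_t value)
{
    ssize_t n = write(fd, &value, sizeof(value));

    return n == (ssize_t)sizeof(value) ? 0 : -1;
}

/* Read one 64-bit counter value; 0 on success, -1 on short read. */
static int example_eventfd_read(int fd, example_eventfd_t *value)
{
    ssize_t n;

    assert(value);
    n = read(fd, value, sizeof(*value));
    return n == (ssize_t)sizeof(*value) ? 0 : -1;
}
```

Note that these wrappers only transport the value. On a pipe, for example, successive writes are queued rather than summed, so the counter semantics of a real Linux eventfd are not reproduced.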




[PATCH v6 06/12] contrib/vhost-user-*: use QEMU bswap helper functions

2024-05-28 Thread Stefano Garzarella
Let's replace the calls to le*toh() and htole*() with qemu/bswap.h
helpers to make the code more portable.

Suggested-by: Philippe Mathieu-Daudé 
Reviewed-by: Philippe Mathieu-Daudé 
Tested-by: Philippe Mathieu-Daudé 
Acked-by: Stefan Hajnoczi 
Reviewed-by: David Hildenbrand 
Signed-off-by: Stefano Garzarella 
---
 contrib/vhost-user-blk/vhost-user-blk.c |  9 +
 contrib/vhost-user-input/main.c | 16 
 2 files changed, 13 insertions(+), 12 deletions(-)

diff --git a/contrib/vhost-user-blk/vhost-user-blk.c b/contrib/vhost-user-blk/vhost-user-blk.c
index a8ab9269a2..9492146855 100644
--- a/contrib/vhost-user-blk/vhost-user-blk.c
+++ b/contrib/vhost-user-blk/vhost-user-blk.c
@@ -16,6 +16,7 @@
  */
 
 #include "qemu/osdep.h"
+#include "qemu/bswap.h"
 #include "standard-headers/linux/virtio_blk.h"
 #include "libvhost-user-glib.h"
 
@@ -194,8 +195,8 @@ vub_discard_write_zeroes(VubReq *req, struct iovec *iov, uint32_t iovcnt,
 #if defined(__linux__) && defined(BLKDISCARD) && defined(BLKZEROOUT)
 VubDev *vdev_blk = req->vdev_blk;
 desc = buf;
-uint64_t range[2] = { le64toh(desc->sector) << 9,
-  le32toh(desc->num_sectors) << 9 };
+uint64_t range[2] = { le64_to_cpu(desc->sector) << 9,
+  le32_to_cpu(desc->num_sectors) << 9 };
 if (type == VIRTIO_BLK_T_DISCARD) {
 if (ioctl(vdev_blk->blk_fd, BLKDISCARD, range) == 0) {
 g_free(buf);
@@ -267,13 +268,13 @@ static int vub_virtio_process_req(VubDev *vdev_blk,
 req->in = (struct virtio_blk_inhdr *)elem->in_sg[in_num - 1].iov_base;
 in_num--;
 
-type = le32toh(req->out->type);
+type = le32_to_cpu(req->out->type);
 switch (type & ~VIRTIO_BLK_T_BARRIER) {
 case VIRTIO_BLK_T_IN:
 case VIRTIO_BLK_T_OUT: {
 ssize_t ret = 0;
 bool is_write = type & VIRTIO_BLK_T_OUT;
-req->sector_num = le64toh(req->out->sector);
+req->sector_num = le64_to_cpu(req->out->sector);
 if (is_write) {
 ret  = vub_writev(req, &elem->out_sg[1], out_num);
 } else {
diff --git a/contrib/vhost-user-input/main.c b/contrib/vhost-user-input/main.c
index 081230da54..f3362d41ac 100644
--- a/contrib/vhost-user-input/main.c
+++ b/contrib/vhost-user-input/main.c
@@ -51,8 +51,8 @@ static void vi_input_send(VuInput *vi, struct virtio_input_event *event)
 vi->queue[vi->qindex++].event = *event;
 
 /* ... until we see a report sync ... */
-if (event->type != htole16(EV_SYN) ||
-event->code != htole16(SYN_REPORT)) {
+if (event->type != cpu_to_le16(EV_SYN) ||
+event->code != cpu_to_le16(SYN_REPORT)) {
 return;
 }
 
@@ -103,9 +103,9 @@ vi_evdev_watch(VuDev *dev, int condition, void *data)
 
 g_debug("input %d %d %d", evdev.type, evdev.code, evdev.value);
 
-virtio.type  = htole16(evdev.type);
-virtio.code  = htole16(evdev.code);
-virtio.value = htole32(evdev.value);
+virtio.type  = cpu_to_le16(evdev.type);
+virtio.code  = cpu_to_le16(evdev.code);
+virtio.value = cpu_to_le32(evdev.value);
 vi_input_send(vi, &virtio);
 }
 }
@@ -124,9 +124,9 @@ static void vi_handle_status(VuInput *vi, virtio_input_event *event)
 
 evdev.input_event_sec = tval.tv_sec;
 evdev.input_event_usec = tval.tv_usec;
-evdev.type = le16toh(event->type);
-evdev.code = le16toh(event->code);
-evdev.value = le32toh(event->value);
+evdev.type = le16_to_cpu(event->type);
+evdev.code = le16_to_cpu(event->code);
+evdev.value = le32_to_cpu(event->value);
 
 rc = write(vi->evdevfd, &evdev, sizeof(evdev));
 if (rc == -1) {
-- 
2.45.1
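
For readers without the QEMU tree at hand, the conversions in the diff above can be pictured with a byte-wise sketch. The `sketch_*` names are hypothetical and this is not the "qemu/bswap.h" implementation; it only illustrates what a cpu_to_le16()/le16_to_cpu() pair is expected to do regardless of the host byte order.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for cpu_to_le16(): lay the value out so that the
 * least significant byte comes first in memory, whatever the host order. */
static inline uint16_t sketch_cpu_to_le16(uint16_t v)
{
    uint8_t bytes[2] = { (uint8_t)(v & 0xff), (uint8_t)(v >> 8) };
    uint16_t out;

    memcpy(&out, bytes, sizeof(out));
    return out;
}

/* Hypothetical stand-in for le16_to_cpu(): the inverse operation. */
static inline uint16_t sketch_le16_to_cpu(uint16_t v)
{
    uint8_t bytes[2];

    memcpy(bytes, &v, sizeof(bytes));
    return (uint16_t)(bytes[0] | (bytes[1] << 8));
}
```

On a little-endian host both helpers are effectively no-ops, which is why a missing conversion tends to go unnoticed until the code runs on a big-endian machine.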




[PATCH v6 01/12] libvhost-user: set msg.msg_control to NULL when it is empty

2024-05-28 Thread Stefano Garzarella
On some OSes (e.g. macOS) sendmsg() returns -1 (errno EINVAL) if
the `struct msghdr` has the field `msg_controllen` set to 0, but
`msg_control` is not NULL.

Reviewed-by: Eric Blake 
Reviewed-by: David Hildenbrand 
Reviewed-by: Philippe Mathieu-Daudé 
Tested-by: Philippe Mathieu-Daudé 
Acked-by: Stefan Hajnoczi 
Signed-off-by: Stefano Garzarella 
---
 subprojects/libvhost-user/libvhost-user.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/subprojects/libvhost-user/libvhost-user.c b/subprojects/libvhost-user/libvhost-user.c
index a879149fef..22bea0c775 100644
--- a/subprojects/libvhost-user/libvhost-user.c
+++ b/subprojects/libvhost-user/libvhost-user.c
@@ -632,6 +632,7 @@ vu_message_write(VuDev *dev, int conn_fd, VhostUserMsg *vmsg)
 memcpy(CMSG_DATA(cmsg), vmsg->fds, fdsize);
 } else {
 msg.msg_controllen = 0;
+msg.msg_control = NULL;
 }
 
 do {
-- 
2.45.1
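
As an illustration of the rule described in this patch, here is a minimal sketch of a send helper that never passes a dangling control buffer. `sketch_send_no_fds()` is a hypothetical name and is not part of libvhost-user; it only shows the two `struct msghdr` fields being kept consistent when there are no file descriptors to pass.

```c
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Hypothetical helper: send one buffer with no ancillary data. */
static ssize_t sketch_send_no_fds(int fd, const void *buf, size_t len)
{
    struct iovec iov = {
        .iov_base = (void *)buf,
        .iov_len = len,
    };
    struct msghdr msg;

    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    /* Keep both fields consistent: no control buffer, zero length. */
    msg.msg_control = NULL;
    msg.msg_controllen = 0;

    return sendmsg(fd, &msg, 0);
}
```

Zeroing the whole structure first would already be enough here; the explicit assignments just mirror the else branch touched by the patch.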




[PATCH v6 00/12] vhost-user: support any POSIX system (tested on macOS, FreeBSD, OpenBSD)

2024-05-28 Thread Stefano Garzarella
v1: https://patchew.org/QEMU/20240228114759.44758-1-sgarz...@redhat.com/
v2: https://patchew.org/QEMU/20240326133936.125332-1-sgarz...@redhat.com/
v3: https://patchew.org/QEMU/20240404122330.92710-1-sgarz...@redhat.com/
v4: https://patchew.org/QEMU/20240508074457.12367-1-sgarz...@redhat.com/
v5: https://patchew.org/QEMU/20240523145522.313012-1-sgarz...@redhat.com/
v6:
- rebased on 60b54b67c63d8f076152e0f7dccf39854dfc6a77
- added David R-b tags [thanks!]
- patch 9 (was split in 9 & 10 in v5): reverted v5 changes since we can't
  move O_DSYNC and O_DIRECT in osdep [Daniel, failing tests on Windows]
- patch 11: removed `share=on` since it's the default [David]
- the series is now fully acked/reviewed

The vhost-user protocol is not really Linux-specific, so let's try to support
QEMU's frontends and backends (including libvhost-user) in any POSIX system
with this series. The main use case is to be able to use virtio devices that
we don't have built-in in QEMU (e.g. virtiofsd, vhost-user-vsock, etc.) even
in non-Linux systems.

The first 5 patches are more like fixes discovered at runtime on macOS or
FreeBSD that could go even independently of this series.

Patches 6, 7, 8, 9 enable building of frontends and backends (including
libvhost-user) with associated code changes to succeed in compilation.

Patch 10 adds `memory-backend-shm` that uses the POSIX shm_open() API to
create shared memory which is identified by an fd that can be shared with
vhost-user backends. This is useful on those systems (like macOS) where
we don't have memfd_create() or special filesystems like "/dev/shm".

Patches 11 and 12 use `memory-backend-shm` in some vhost-user tests.

Maybe the first 5 patches can go separately, but I only discovered those
problems after testing patches 6 - 9, so I have included them in this series
for now. Please let me know if you prefer that I send them separately.

I tested this series using vhost-user-blk and QSD on macOS Sonoma 14.4
(aarch64), FreeBSD 14 (x86_64), OpenBSD 7.4 (x86_64), and Fedora 40 (x86_64)
in this way:

- Start vhost-user-blk or QSD (same commands for all systems)

  vhost-user-blk -s /tmp/vhost.socket \
-b Fedora-Cloud-Base-39-1.5.x86_64.raw

  qemu-storage-daemon \
--blockdev file,filename=Fedora-Cloud-Base-39-1.5.x86_64.qcow2,node-name=file \
--blockdev qcow2,file=file,node-name=qcow2 \
--export vhost-user-blk,addr.type=unix,addr.path=/tmp/vhost.socket,id=vub,num-queues=1,node-name=qcow2,writable=on

- macOS (aarch64): start QEMU (using hvf accelerator)

  qemu-system-aarch64 -smp 2 -cpu host -M virt,accel=hvf,memory-backend=mem \
-drive file=./build/pc-bios/edk2-aarch64-code.fd,if=pflash,format=raw,readonly=on \
-device virtio-net-device,netdev=net0 -netdev user,id=net0 \
-device ramfb -device usb-ehci -device usb-kbd \
-object memory-backend-shm,id=mem,size=512M \
-device vhost-user-blk-pci,num-queues=1,disable-legacy=on,chardev=char0 \
-chardev socket,id=char0,path=/tmp/vhost.socket

- FreeBSD/OpenBSD (x86_64): start QEMU (no accelerators available)

  qemu-system-x86_64 -smp 2 -M q35,memory-backend=mem \
-object memory-backend-shm,id=mem,size="512M" \
-device vhost-user-blk-pci,num-queues=1,chardev=char0 \
-chardev socket,id=char0,path=/tmp/vhost.socket

- Fedora (x86_64): start QEMU (using kvm accelerator)

  qemu-system-x86_64 -smp 2 -M q35,accel=kvm,memory-backend=mem \
-object memory-backend-shm,id=mem,size="512M" \
-device vhost-user-blk-pci,num-queues=1,chardev=char0 \
-chardev socket,id=char0,path=/tmp/vhost.socket

Branch pushed (and CI started) at 
https://gitlab.com/sgarzarella/qemu/-/tree/macos-vhost-user?ref_type=heads

Thanks,
Stefano

Stefano Garzarella (12):
  libvhost-user: set msg.msg_control to NULL when it is empty
  libvhost-user: fail vu_message_write() if sendmsg() is failing
  libvhost-user: mask F_INFLIGHT_SHMFD if memfd is not supported
  vhost-user-server: do not set memory fd non-blocking
  contrib/vhost-user-blk: fix bind() using the right size of the address
  contrib/vhost-user-*: use QEMU bswap helper functions
  vhost-user: enable frontends on any POSIX system
  libvhost-user: enable it on any POSIX system
  contrib/vhost-user-blk: enable it on any POSIX system
  hostmem: add a new memory backend based on POSIX shm_open()
  tests/qtest/vhost-user-blk-test: use memory-backend-shm
  tests/qtest/vhost-user-test: add a test case for memory-backend-shm

 docs/system/devices/vhost-user.rst|   5 +-
 meson.build   |   5 +-
 qapi/qom.json |  19 
 subprojects/libvhost-user/libvhost-user.h |   2 +-
 backends/hostmem-shm.c| 123 ++
 contrib/vhost-user-blk/vhost-user-blk.c   |  27 +++--
 contrib/vhost-user-input/main.c   |  16 +--
 hw/net/vhost_net.c|   5 +
 subprojects/libvhost-user/libvhost-user.c |  77 +-
 tests/qtest/v

[PATCH v6 02/12] libvhost-user: fail vu_message_write() if sendmsg() is failing

2024-05-28 Thread Stefano Garzarella
In vu_message_write() we use sendmsg() to send the message header,
then a write() to send the payload.

If sendmsg() fails we should avoid sending the payload, since we
were unable to send the header.

Discovered before fixing the issue with the previous patch, where
sendmsg() failed on macOS due to wrong parameters, but the frontend
still sent the payload which the backend incorrectly interpreted
as a wrong header.

Reviewed-by: Daniel P. Berrangé 
Reviewed-by: Philippe Mathieu-Daudé 
Tested-by: Philippe Mathieu-Daudé 
Acked-by: Stefan Hajnoczi 
Reviewed-by: David Hildenbrand 
Signed-off-by: Stefano Garzarella 
---
 subprojects/libvhost-user/libvhost-user.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/subprojects/libvhost-user/libvhost-user.c b/subprojects/libvhost-user/libvhost-user.c
index 22bea0c775..a11afd1960 100644
--- a/subprojects/libvhost-user/libvhost-user.c
+++ b/subprojects/libvhost-user/libvhost-user.c
@@ -639,6 +639,11 @@ vu_message_write(VuDev *dev, int conn_fd, VhostUserMsg *vmsg)
 rc = sendmsg(conn_fd, &msg, 0);
 } while (rc < 0 && (errno == EINTR || errno == EAGAIN));
 
+if (rc <= 0) {
+vu_panic(dev, "Error while writing: %s", strerror(errno));
+return false;
+}
+
 if (vmsg->size) {
 do {
 if (vmsg->data) {
-- 
2.45.1
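
The header-then-payload ordering described in this patch can be sketched as follows. `sketch_send_hdr_then_payload()` is a hypothetical helper, not the libvhost-user function, and it leaves out fd passing and vu_panic(); it only shows that the payload is written if and only if the header went out.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: send a header with sendmsg() and, only if that
 * worked, the payload with write(). Returns true on success. */
static bool sketch_send_hdr_then_payload(int fd,
                                         const void *hdr, size_t hdr_len,
                                         const void *data, size_t data_len)
{
    struct iovec iov = { .iov_base = (void *)hdr, .iov_len = hdr_len };
    struct msghdr msg;
    ssize_t rc;

    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;

    do {
        rc = sendmsg(fd, &msg, 0);
    } while (rc < 0 && (errno == EINTR || errno == EAGAIN));

    if (rc <= 0) {
        /* Header not sent: never let the payload go out on its own. */
        return false;
    }

    if (data_len) {
        do {
            rc = write(fd, data, data_len);
        } while (rc < 0 && (errno == EINTR || errno == EAGAIN));
    }

    return rc > 0;
}
```

Without the early return, the peer would read the first bytes of the payload as if they were a header, which is the misbehaviour the commit message describes.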




[PATCH v6 07/12] vhost-user: enable frontends on any POSIX system

2024-05-28 Thread Stefano Garzarella
The vhost-user protocol is not really Linux-specific so let's enable
vhost-user frontends for any POSIX system.

In vhost_net.c we use VHOST_FILE_UNBIND, which is defined in a
Linux-specific header, so let's define it for other systems as well.

Reviewed-by: Philippe Mathieu-Daudé 
Tested-by: Philippe Mathieu-Daudé 
Acked-by: Stefan Hajnoczi 
Reviewed-by: David Hildenbrand 
Signed-off-by: Stefano Garzarella 
---
 meson.build| 1 -
 hw/net/vhost_net.c | 5 +
 hw/block/Kconfig   | 2 +-
 3 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/meson.build b/meson.build
index 6386607144..a72500be77 100644
--- a/meson.build
+++ b/meson.build
@@ -151,7 +151,6 @@ have_tpm = get_option('tpm') \
 
 # vhost
 have_vhost_user = get_option('vhost_user') \
-  .disable_auto_if(host_os != 'linux') \
   .require(host_os != 'windows',
error_message: 'vhost-user is not available on Windows').allowed()
 have_vhost_vdpa = get_option('vhost_vdpa') \
diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
index fd1a93701a..fced429813 100644
--- a/hw/net/vhost_net.c
+++ b/hw/net/vhost_net.c
@@ -34,8 +34,13 @@
 #include "standard-headers/linux/virtio_ring.h"
 #include "hw/virtio/vhost.h"
 #include "hw/virtio/virtio-bus.h"
+#if defined(__linux__)
 #include "linux-headers/linux/vhost.h"
+#endif
 
+#ifndef VHOST_FILE_UNBIND
+#define VHOST_FILE_UNBIND -1
+#endif
 
 /* Features supported by host kernel. */
 static const int kernel_feature_bits[] = {
diff --git a/hw/block/Kconfig b/hw/block/Kconfig
index 9e8f28f982..29ee09e434 100644
--- a/hw/block/Kconfig
+++ b/hw/block/Kconfig
@@ -40,7 +40,7 @@ config VHOST_USER_BLK
 bool
 # Only PCI devices are provided for now
 default y if VIRTIO_PCI
-depends on VIRTIO && VHOST_USER && LINUX
+depends on VIRTIO && VHOST_USER
 
 config SWIM
 bool
-- 
2.45.1




Re: [PATCH v5 09/13] osdep: move O_DSYNC and O_DIRECT defines from file-posix

2024-05-23 Thread Stefano Garzarella

On Thu, May 23, 2024 at 04:55:18PM GMT, Stefano Garzarella wrote:

These defines are also useful for vhost-user-blk when it is compiled
in some POSIX systems that do not define them, so let's move them to
“qemu/osdep.h”.

Suggested-by: Philippe Mathieu-Daudé 
Signed-off-by: Stefano Garzarella 
---
include/qemu/osdep.h | 14 ++
block/file-posix.c   | 14 --
2 files changed, 14 insertions(+), 14 deletions(-)


This seems to break the compilation on win64: 
https://gitlab.com/sgarzarella/qemu/-/jobs/6923403322


In file included from ../util/osdep.c:24:
../util/osdep.c: In function 'qemu_open_internal':
../include/qemu/osdep.h:339:18: error: 'O_DSYNC' undeclared (first use in this function)
  339 | #define O_DIRECT O_DSYNC
  |  ^~~
../util/osdep.c:334:41: note: in expansion of macro 'O_DIRECT'
  334 | if (errno == EINVAL && (flags & O_DIRECT)) {
  | ^~~~
../include/qemu/osdep.h:339:18: note: each undeclared identifier is reported only once for each function it appears in
  339 | #define O_DIRECT O_DSYNC
  |  ^~~
../util/osdep.c:334:41: note: in expansion of macro 'O_DIRECT'
  334 | if (errno == EINVAL && (flags & O_DIRECT)) {
  | ^~~~
../util/osdep.c: In function 'qemu_open_old':
../include/qemu/osdep.h:339:18: error: 'O_DSYNC' undeclared (first use in this function)
  339 | #define O_DIRECT O_DSYNC
  |  ^~~
../util/osdep.c:385:50: note: in expansion of macro 'O_DIRECT'
  385 | if (ret == -1 && errno == EINVAL && (flags & O_DIRECT)) {
  |

Indeed file-posix.c was not compiled on windows. Oops, I didn't think of 
that :-(


I'm thinking of putting a guard on CONFIG_POSIX, or just checking that 
O_DSYNC is defined. Any suggestion?
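
To make the second option concrete, a rough sketch (untested on win64, and only an assumption about how such a guard could look) would define the O_DIRECT fallback only when O_DSYNC itself ended up defined:

```c
#include <fcntl.h>

/* Some systems lack O_DSYNC: fall back to the closest available flag. */
#ifndef O_DSYNC
#ifdef O_SYNC
#define O_DSYNC O_SYNC
#elif defined(O_FSYNC)
#define O_DSYNC O_FSYNC
#endif
#endif

/* Approximate O_DIRECT only where O_DSYNC exists. Where neither flag is
 * available, O_DIRECT stays undefined and callers have to guard their
 * own uses with #ifdef O_DIRECT. */
#if !defined(O_DIRECT) && defined(O_DSYNC)
#define O_DIRECT O_DSYNC
#endif
```

This only avoids expanding O_DIRECT to an undeclared identifier; it does not answer whether approximating O_DIRECT tree-wide is a good idea in the first place.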


Thanks,
Stefano



diff --git a/include/qemu/osdep.h b/include/qemu/osdep.h
index f61edcfdc2..e165b5cb1b 100644
--- a/include/qemu/osdep.h
+++ b/include/qemu/osdep.h
@@ -325,6 +325,20 @@ void QEMU_ERROR("code path is reachable")
#define ESHUTDOWN 4099
#endif

+/* OS X does not have O_DSYNC */
+#ifndef O_DSYNC
+#ifdef O_SYNC
+#define O_DSYNC O_SYNC
+#elif defined(O_FSYNC)
+#define O_DSYNC O_FSYNC
+#endif
+#endif
+
+/* Approximate O_DIRECT with O_DSYNC if O_DIRECT isn't available */
+#ifndef O_DIRECT
+#define O_DIRECT O_DSYNC
+#endif
+
#define RETRY_ON_EINTR(expr) \
(__extension__  \
({ typeof(expr) __result;   \
diff --git a/block/file-posix.c b/block/file-posix.c
index 35684f7e21..7a196a2abf 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -110,20 +110,6 @@
#include 
#endif

-/* OS X does not have O_DSYNC */
-#ifndef O_DSYNC
-#ifdef O_SYNC
-#define O_DSYNC O_SYNC
-#elif defined(O_FSYNC)
-#define O_DSYNC O_FSYNC
-#endif
-#endif
-
-/* Approximate O_DIRECT with O_DSYNC if O_DIRECT isn't available */
-#ifndef O_DIRECT
-#define O_DIRECT O_DSYNC
-#endif
-
#define FTYPE_FILE   0
#define FTYPE_CD 1

--
2.45.1






Re: [PATCH v5 12/13] tests/qtest/vhost-user-blk-test: use memory-backend-shm

2024-05-23 Thread Stefano Garzarella

On Thu, May 23, 2024 at 05:06:00PM GMT, David Hildenbrand wrote:

On 23.05.24 16:55, Stefano Garzarella wrote:

`memory-backend-memfd` is available only on Linux while the new
`memory-backend-shm` can be used on any POSIX-compliant operating
system. Let's use it so we can run the test in multiple environments.

Acked-by: Thomas Huth 
Acked-by: Stefan Hajnoczi 
Reviewed-by: Philippe Mathieu-Daudé 
Tested-by: Philippe Mathieu-Daudé 
Signed-off-by: Stefano Garzarella 
---
 tests/qtest/vhost-user-blk-test.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/qtest/vhost-user-blk-test.c b/tests/qtest/vhost-user-blk-test.c
index 117b9acd10..e945f6abf2 100644
--- a/tests/qtest/vhost-user-blk-test.c
+++ b/tests/qtest/vhost-user-blk-test.c
@@ -906,7 +906,7 @@ static void start_vhost_user_blk(GString *cmd_line, int vus_instances,
vhost_user_blk_bin);
 g_string_append_printf(cmd_line,
-" -object memory-backend-memfd,id=mem,size=256M,share=on "
+" -object memory-backend-shm,id=mem,size=256M,share=on "


Can we simplify and drop the share=on?


Good catch! I'll do it in the next version!



Reviewed-by: David Hildenbrand 


Thanks for the reviews,
Stefano




Re: [PATCH v5 09/13] osdep: move O_DSYNC and O_DIRECT defines from file-posix

2024-05-23 Thread Stefano Garzarella

On Thu, May 23, 2024 at 04:14:48PM GMT, Daniel P. Berrangé wrote:

On Thu, May 23, 2024 at 04:55:18PM +0200, Stefano Garzarella wrote:

These defines are also useful for vhost-user-blk when it is compiled
in some POSIX systems that do not define them, so let's move them to
“qemu/osdep.h”.

Suggested-by: Philippe Mathieu-Daudé 
Signed-off-by: Stefano Garzarella 
---
 include/qemu/osdep.h | 14 ++
 block/file-posix.c   | 14 --
 2 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/include/qemu/osdep.h b/include/qemu/osdep.h
index f61edcfdc2..e165b5cb1b 100644
--- a/include/qemu/osdep.h
+++ b/include/qemu/osdep.h
@@ -325,6 +325,20 @@ void QEMU_ERROR("code path is reachable")
 #define ESHUTDOWN 4099
 #endif

+/* OS X does not have O_DSYNC */
+#ifndef O_DSYNC
+#ifdef O_SYNC
+#define O_DSYNC O_SYNC
+#elif defined(O_FSYNC)
+#define O_DSYNC O_FSYNC
+#endif
+#endif
+
+/* Approximate O_DIRECT with O_DSYNC if O_DIRECT isn't available */
+#ifndef O_DIRECT
+#define O_DIRECT O_DSYNC
+#endif


Please don't do this - we can't be confident that all code in
QEMU will be OK with O_DIRECT being simulated in this way.

I'm not convinced that the O_DSYNC simulation is a good idea
to do tree-wide either.


I was a little scared, and you and the failing tests on win64 convinced 
me to bring this back as in v4 ;-)


Thanks,
Stefano




[PATCH v5 05/13] contrib/vhost-user-blk: fix bind() using the right size of the address

2024-05-23 Thread Stefano Garzarella
On macOS, when passing the `-s /tmp/vhost.socket` parameter to the
vhost-user-blk application, the bind was done on the `/tmp/vhost.socke`
pathname, missing the last character.

This sounds like one of the portability problems described in the
unix(7) manpage:

   Pathname sockets
   When  binding  a socket to a pathname, a few rules should
   be observed for maximum portability and ease of coding:

   •  The pathname in sun_path should be null-terminated.

   •  The length of the pathname, including the  terminating
  null byte, should not exceed the size of sun_path.

   •  The  addrlen  argument  that  describes  the enclosing
  sockaddr_un structure should have a value of at least:

  offsetof(struct sockaddr_un, sun_path) +
  strlen(addr.sun_path)+1

  or,  more  simply,  addrlen  can   be   specified   as
  sizeof(struct sockaddr_un).

So let's follow the last advice and simplify the code as well.

Reviewed-by: Philippe Mathieu-Daudé 
Tested-by: Philippe Mathieu-Daudé 
Acked-by: Stefan Hajnoczi 
Signed-off-by: Stefano Garzarella 
---
 contrib/vhost-user-blk/vhost-user-blk.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/contrib/vhost-user-blk/vhost-user-blk.c b/contrib/vhost-user-blk/vhost-user-blk.c
index 89e5f11a64..a8ab9269a2 100644
--- a/contrib/vhost-user-blk/vhost-user-blk.c
+++ b/contrib/vhost-user-blk/vhost-user-blk.c
@@ -469,7 +469,6 @@ static int unix_sock_new(char *unix_fn)
 {
 int sock;
 struct sockaddr_un un;
-size_t len;
 
 assert(unix_fn);
 
@@ -481,10 +480,9 @@ static int unix_sock_new(char *unix_fn)
 
 un.sun_family = AF_UNIX;
 (void)snprintf(un.sun_path, sizeof(un.sun_path), "%s", unix_fn);
-len = sizeof(un.sun_family) + strlen(un.sun_path);
 
 (void)unlink(unix_fn);
-if (bind(sock, (struct sockaddr *)&un, len) < 0) {
+if (bind(sock, (struct sockaddr *)&un, sizeof(un)) < 0) {
 perror("bind");
 goto fail;
 }
-- 
2.45.1
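
For context on why the old length was short: as far as I can tell, on macOS and the BSDs `struct sockaddr_un` starts with a one-byte `sun_len` followed by a one-byte `sun_family`, so `sizeof(un.sun_family) + strlen(un.sun_path)` is one byte less than the offset of `sun_path` plus the string length. A minimal sketch of the simplified approach follows; `sketch_unix_bind()` is a hypothetical helper, not the function in vhost-user-blk.c.

```c
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical helper: create a UNIX stream socket bound to `path`.
 * Returns the socket fd, or -1 on error. */
static int sketch_unix_bind(const char *path)
{
    struct sockaddr_un un;
    int sock;

    if (strlen(path) >= sizeof(un.sun_path)) {
        return -1; /* would not fit with its terminating null byte */
    }

    sock = socket(AF_UNIX, SOCK_STREAM, 0);
    if (sock < 0) {
        return -1;
    }

    memset(&un, 0, sizeof(un));
    un.sun_family = AF_UNIX;
    snprintf(un.sun_path, sizeof(un.sun_path), "%s", path);

    (void)unlink(path);
    /* sizeof(un) covers any leading fields plus the whole sun_path. */
    if (bind(sock, (struct sockaddr *)&un, sizeof(un)) < 0) {
        close(sock);
        return -1;
    }

    return sock;
}
```

Passing the full structure size relies on sun_path being null-terminated, which the memset() plus snprintf() pair guarantees in this sketch.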




[PATCH v5 11/13] hostmem: add a new memory backend based on POSIX shm_open()

2024-05-23 Thread Stefano Garzarella
shm_open() creates and opens a new POSIX shared memory object.
A POSIX shared memory object allows creating a memory backend with an
associated file descriptor that can be shared with external processes
(e.g. vhost-user).

The new `memory-backend-shm` can be used as an alternative when
`memory-backend-memfd` is not available (Linux only), since shm_open()
should be provided by any POSIX-compliant operating system.

This backend mimics memfd, allocating memory that is practically
anonymous. In theory shm_open() requires a name, but this is allocated
for a short time interval and shm_unlink() is called right after
shm_open(). After that, only fd is shared with external processes
(e.g., vhost-user) as if it were associated with anonymous memory.

In the future we may also allow the user to specify the name to be
passed to shm_open(), but for now we keep the backend simple, mimicking
anonymous memory such as memfd.
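
The open-then-unlink pattern described above can be sketched in isolation like this. `sketch_anon_shm()` and its name format are hypothetical and are not what hostmem-shm.c does in detail; the sketch only shows how a named POSIX shared memory object ends up being reachable through the fd alone.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: return an fd backed by `size` bytes of POSIX
 * shared memory whose name is already unlinked, or -1 on error. */
static int sketch_anon_shm(size_t size)
{
    char name[64];
    int fd;

    /* The name format is an assumption; it only needs to be unique
     * for the short time between shm_open() and shm_unlink(). */
    snprintf(name, sizeof(name), "/sketch-shm-%ld", (long)getpid());

    fd = shm_open(name, O_RDWR | O_CREAT | O_EXCL, 0600);
    if (fd < 0) {
        return -1;
    }

    /* Drop the name right away: only the fd keeps the object alive. */
    shm_unlink(name);

    if (ftruncate(fd, (off_t)size) < 0) {
        close(fd);
        return -1;
    }

    return fd;
}
```

The returned fd can then be mapped with MAP_SHARED and handed to another process over a UNIX socket, much like a memfd. On older glibc versions linking with -lrt may be needed for shm_open().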

Acked-by: David Hildenbrand 
Acked-by: Stefan Hajnoczi 
Signed-off-by: Stefano Garzarella 
---
v5
- fixed documentation in qapi/qom.json and qemu-options.hx [Markus]
v4
- fail if we find "share=off" in shm_backend_memory_alloc() [David]
v3
- enriched commit message and documentation to highlight that we
  want to mimic memfd (David)
---
 docs/system/devices/vhost-user.rst |   5 +-
 qapi/qom.json  |  19 +
 backends/hostmem-shm.c | 123 +
 backends/meson.build   |   1 +
 qemu-options.hx|  16 
 5 files changed, 162 insertions(+), 2 deletions(-)
 create mode 100644 backends/hostmem-shm.c

diff --git a/docs/system/devices/vhost-user.rst b/docs/system/devices/vhost-user.rst
index 9b2da106ce..35259d8ec7 100644
--- a/docs/system/devices/vhost-user.rst
+++ b/docs/system/devices/vhost-user.rst
@@ -98,8 +98,9 @@ Shared memory object
 
 In order for the daemon to access the VirtIO queues to process the
 requests it needs access to the guest's address space. This is
-achieved via the ``memory-backend-file`` or ``memory-backend-memfd``
-objects. A reference to a file-descriptor which can access this object
+achieved via the ``memory-backend-file``, ``memory-backend-memfd``, or
+``memory-backend-shm`` objects.
+A reference to a file-descriptor which can access this object
 will be passed via the socket as part of the protocol negotiation.
 
 Currently the shared memory object needs to match the size of the main
diff --git a/qapi/qom.json b/qapi/qom.json
index 38dde6d785..d40592d863 100644
--- a/qapi/qom.json
+++ b/qapi/qom.json
@@ -721,6 +721,21 @@
 '*hugetlbsize': 'size',
 '*seal': 'bool' } }
 
+##
+# @MemoryBackendShmProperties:
+#
+# Properties for memory-backend-shm objects.
+#
+# The @share boolean option is true by default with shm. Setting it to false
+# will cause a failure during allocation because it is not supported by this
+# backend.
+#
+# Since: 9.1
+##
+{ 'struct': 'MemoryBackendShmProperties',
+  'base': 'MemoryBackendProperties',
+  'data': { } }
+
 ##
 # @MemoryBackendEpcProperties:
 #
@@ -985,6 +1000,8 @@
 { 'name': 'memory-backend-memfd',
   'if': 'CONFIG_LINUX' },
 'memory-backend-ram',
+{ 'name': 'memory-backend-shm',
+  'if': 'CONFIG_POSIX' },
 'pef-guest',
 { 'name': 'pr-manager-helper',
   'if': 'CONFIG_LINUX' },
@@ -1056,6 +1073,8 @@
   'memory-backend-memfd':   { 'type': 'MemoryBackendMemfdProperties',
   'if': 'CONFIG_LINUX' },
   'memory-backend-ram': 'MemoryBackendProperties',
+  'memory-backend-shm': { 'type': 'MemoryBackendShmProperties',
+  'if': 'CONFIG_POSIX' },
   'pr-manager-helper':  { 'type': 'PrManagerHelperProperties',
   'if': 'CONFIG_LINUX' },
   'qtest':  'QtestProperties',
diff --git a/backends/hostmem-shm.c b/backends/hostmem-shm.c
new file mode 100644
index 00..374edc3db8
--- /dev/null
+++ b/backends/hostmem-shm.c
@@ -0,0 +1,123 @@
+/*
+ * QEMU host POSIX shared memory object backend
+ *
+ * Copyright (C) 2024 Red Hat Inc
+ *
+ * Authors:
+ *   Stefano Garzarella 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "sysemu/hostmem.h"
+#include "qapi/error.h"
+
+#define TYPE_MEMORY_BACKEND_SHM "memory-backend-shm"
+
+OBJECT_DECLARE_SIMPLE_TYPE(HostMemoryBackendShm, MEMORY_BACKEND_SHM)
+
+struct HostMemoryBackendShm {
+HostMemoryBackend parent_obj;
+};
+
+static bool
+shm_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
+{
+g_autoptr(GString) shm_name = g_string_new(NULL);
+g_autofree char *backend_name = NULL;
+uint32_t ram_flags;
+int fd, oflag;
+mode_t mode;
+
+if (!backend->size) {
+error_setg(errp, "can

[PATCH v5 10/13] contrib/vhost-user-blk: enable it on any POSIX system

2024-05-23 Thread Stefano Garzarella
Previous patches made the vhost-user-blk application and the
vhost-user-server.c dependency buildable for any POSIX system.

Acked-by: Stefan Hajnoczi 
Reviewed-by: Philippe Mathieu-Daudé 
Tested-by: Philippe Mathieu-Daudé 
Signed-off-by: Stefano Garzarella 
---
v5:
- O_DSYNC and O_DIRECT definition are now in osdep [Phil]
- commit updated since we moved out all code changes
v4:
- moved using of "qemu/bswap.h" API in a separate patch [Phil]
---
 meson.build  | 2 --
 util/meson.build | 4 +++-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/meson.build b/meson.build
index 543105af2a..06a1835a09 100644
--- a/meson.build
+++ b/meson.build
@@ -1974,8 +1974,6 @@ has_statx = cc.has_header_symbol('sys/stat.h', 'STATX_BASIC_STATS', prefix: gnu_
 has_statx_mnt_id = cc.has_header_symbol('sys/stat.h', 'STATX_MNT_ID', prefix: gnu_source_prefix)
 
 have_vhost_user_blk_server = get_option('vhost_user_blk_server') \
-  .require(host_os == 'linux',
-   error_message: 'vhost_user_blk_server requires linux') \
   .require(have_vhost_user,
error_message: 'vhost_user_blk_server requires vhost-user support') \
   .disable_auto_if(not have_tools and not have_system) \
diff --git a/util/meson.build b/util/meson.build
index 72b505df11..c414178ace 100644
--- a/util/meson.build
+++ b/util/meson.build
@@ -112,10 +112,12 @@ if have_block
 util_ss.add(files('filemonitor-stub.c'))
   endif
   if host_os == 'linux'
-util_ss.add(files('vhost-user-server.c'), vhost_user)
 util_ss.add(files('vfio-helpers.c'))
 util_ss.add(files('chardev_open.c'))
   endif
+  if host_os != 'windows'
+util_ss.add(files('vhost-user-server.c'), vhost_user)
+  endif
   util_ss.add(files('yank.c'))
 endif
 
-- 
2.45.1




[PATCH v5 13/13] tests/qtest/vhost-user-test: add a test case for memory-backend-shm

2024-05-23 Thread Stefano Garzarella
`memory-backend-shm` can be used with vhost-user devices, so let's
add a new test case for it.

Acked-by: Thomas Huth 
Acked-by: Stefan Hajnoczi 
Signed-off-by: Stefano Garzarella 
---
 tests/qtest/vhost-user-test.c | 23 +++
 1 file changed, 23 insertions(+)

diff --git a/tests/qtest/vhost-user-test.c b/tests/qtest/vhost-user-test.c
index d4e437265f..8c1d903b2a 100644
--- a/tests/qtest/vhost-user-test.c
+++ b/tests/qtest/vhost-user-test.c
@@ -44,6 +44,8 @@
 "mem-path=%s,share=on -numa node,memdev=mem"
 #define QEMU_CMD_MEMFD  " -m %d -object memory-backend-memfd,id=mem,size=%dM," \
 " -numa node,memdev=mem"
+#define QEMU_CMD_SHM" -m %d -object memory-backend-shm,id=mem,size=%dM," \
+" -numa node,memdev=mem"
 #define QEMU_CMD_CHR" -chardev socket,id=%s,path=%s%s"
 #define QEMU_CMD_NETDEV " -netdev vhost-user,id=hs0,chardev=%s,vhostforce=on"
 
@@ -195,6 +197,7 @@ enum test_memfd {
 TEST_MEMFD_AUTO,
 TEST_MEMFD_YES,
 TEST_MEMFD_NO,
+TEST_MEMFD_SHM,
 };
 
 static void append_vhost_net_opts(TestServer *s, GString *cmd_line,
@@ -228,6 +231,8 @@ static void append_mem_opts(TestServer *server, GString *cmd_line,
 
 if (memfd == TEST_MEMFD_YES) {
 g_string_append_printf(cmd_line, QEMU_CMD_MEMFD, size, size);
+} else if (memfd == TEST_MEMFD_SHM) {
+g_string_append_printf(cmd_line, QEMU_CMD_SHM, size, size);
 } else {
 const char *root = init_hugepagefs() ? : server->tmpfs;
 
@@ -788,6 +793,19 @@ static void *vhost_user_test_setup_memfd(GString *cmd_line, void *arg)
 return server;
 }
 
+static void *vhost_user_test_setup_shm(GString *cmd_line, void *arg)
+{
+TestServer *server = test_server_new("vhost-user-test", arg);
+test_server_listen(server);
+
+append_mem_opts(server, cmd_line, 256, TEST_MEMFD_SHM);
+server->vu_ops->append_opts(server, cmd_line, "");
+
+g_test_queue_destroy(vhost_user_test_cleanup, server);
+
+return server;
+}
+
 static void test_read_guest_mem(void *obj, void *arg, QGuestAllocator *alloc)
 {
 TestServer *server = arg;
@@ -1081,6 +1099,11 @@ static void register_vhost_user_test(void)
  "virtio-net",
  test_read_guest_mem, &opts);
 
+opts.before = vhost_user_test_setup_shm;
+qos_add_test("vhost-user/read-guest-mem/shm",
+ "virtio-net",
+ test_read_guest_mem, &opts);
+
 if (qemu_memfd_check(MFD_ALLOW_SEALING)) {
 opts.before = vhost_user_test_setup_memfd;
 qos_add_test("vhost-user/read-guest-mem/memfd",
-- 
2.45.1




[PATCH v5 07/13] vhost-user: enable frontends on any POSIX system

2024-05-23 Thread Stefano Garzarella
The vhost-user protocol is not really Linux-specific so let's enable
vhost-user frontends for any POSIX system.

In vhost_net.c we use VHOST_FILE_UNBIND, which is defined in a
Linux-specific header, so let's define it for other systems as well.

Reviewed-by: Philippe Mathieu-Daudé 
Tested-by: Philippe Mathieu-Daudé 
Acked-by: Stefan Hajnoczi 
Signed-off-by: Stefano Garzarella 
---
 meson.build| 1 -
 hw/net/vhost_net.c | 5 +
 hw/block/Kconfig   | 2 +-
 3 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/meson.build b/meson.build
index a9de71d450..5a6f7a36eb 100644
--- a/meson.build
+++ b/meson.build
@@ -151,7 +151,6 @@ have_tpm = get_option('tpm') \
 
 # vhost
 have_vhost_user = get_option('vhost_user') \
-  .disable_auto_if(host_os != 'linux') \
   .require(host_os != 'windows',
error_message: 'vhost-user is not available on Windows').allowed()
 have_vhost_vdpa = get_option('vhost_vdpa') \
diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
index fd1a93701a..fced429813 100644
--- a/hw/net/vhost_net.c
+++ b/hw/net/vhost_net.c
@@ -34,8 +34,13 @@
 #include "standard-headers/linux/virtio_ring.h"
 #include "hw/virtio/vhost.h"
 #include "hw/virtio/virtio-bus.h"
+#if defined(__linux__)
 #include "linux-headers/linux/vhost.h"
+#endif
 
+#ifndef VHOST_FILE_UNBIND
+#define VHOST_FILE_UNBIND -1
+#endif
 
 /* Features supported by host kernel. */
 static const int kernel_feature_bits[] = {
diff --git a/hw/block/Kconfig b/hw/block/Kconfig
index 9e8f28f982..29ee09e434 100644
--- a/hw/block/Kconfig
+++ b/hw/block/Kconfig
@@ -40,7 +40,7 @@ config VHOST_USER_BLK
 bool
 # Only PCI devices are provided for now
 default y if VIRTIO_PCI
-depends on VIRTIO && VHOST_USER && LINUX
+depends on VIRTIO && VHOST_USER
 
 config SWIM
 bool
-- 
2.45.1




[PATCH v5 08/13] libvhost-user: enable it on any POSIX system

2024-05-23 Thread Stefano Garzarella
The vhost-user protocol is not really Linux-specific so let's enable
libvhost-user for any POSIX system.

When compiling it on macOS and FreeBSD, some problems came up:
- avoid including linux/vhost.h, which is available only on Linux
  (vhost_types.h contains many of the things we need)
- macOS doesn't provide sys/endian.h, so let's define the endianness
  conversion macros ourselves (note: libvhost-user doesn't include
  QEMU's headers, so we can't use "qemu/bswap.h")
- define eventfd_[write|read] as write/read wrappers when the system
  doesn't provide them (e.g. macOS)
- copy the SEAL defines from include/qemu/memfd.h to make the code work
  on FreeBSD, where MFD_ALLOW_SEALING is defined
- define MAP_NORESERVE if it's not defined (e.g. on FreeBSD)

Reviewed-by: Philippe Mathieu-Daudé 
Tested-by: Philippe Mathieu-Daudé 
Acked-by: Stefan Hajnoczi 
Signed-off-by: Stefano Garzarella 
---
v5:
- fixed typos in the commit description [Phil]
---
 meson.build   |  2 +-
 subprojects/libvhost-user/libvhost-user.h |  2 +-
 subprojects/libvhost-user/libvhost-user.c | 60 +--
 3 files changed, 59 insertions(+), 5 deletions(-)

diff --git a/meson.build b/meson.build
index 5a6f7a36eb..543105af2a 100644
--- a/meson.build
+++ b/meson.build
@@ -3155,7 +3155,7 @@ if have_system and vfio_user_server_allowed
 endif
 
 vhost_user = not_found
-if host_os == 'linux' and have_vhost_user
+if have_vhost_user
   libvhost_user = subproject('libvhost-user')
   vhost_user = libvhost_user.get_variable('vhost_user_dep')
 endif
diff --git a/subprojects/libvhost-user/libvhost-user.h b/subprojects/libvhost-user/libvhost-user.h
index deb40e77b3..e13e1d3931 100644
--- a/subprojects/libvhost-user/libvhost-user.h
+++ b/subprojects/libvhost-user/libvhost-user.h
@@ -18,9 +18,9 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include "standard-headers/linux/virtio_ring.h"
+#include "standard-headers/linux/vhost_types.h"
 
 /* Based on qemu/hw/virtio/vhost-user.c */
 #define VHOST_USER_F_PROTOCOL_FEATURES 30
diff --git a/subprojects/libvhost-user/libvhost-user.c b/subprojects/libvhost-user/libvhost-user.c
index 2c20cdc16e..57e58d4adb 100644
--- a/subprojects/libvhost-user/libvhost-user.c
+++ b/subprojects/libvhost-user/libvhost-user.c
@@ -28,9 +28,7 @@
 #include 
 #include 
 #include 
-#include 
 #include 
-#include 
 
 /* Necessary to provide VIRTIO_F_VERSION_1 on system
  * with older linux headers. Must appear before
@@ -39,8 +37,8 @@
 #include "standard-headers/linux/virtio_config.h"
 
 #if defined(__linux__)
+#include 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -52,6 +50,62 @@
 
 #endif
 
+#if defined(__APPLE__) && (__MACH__)
+#include 
+#define htobe16(x) OSSwapHostToBigInt16(x)
+#define htole16(x) OSSwapHostToLittleInt16(x)
+#define be16toh(x) OSSwapBigToHostInt16(x)
+#define le16toh(x) OSSwapLittleToHostInt16(x)
+
+#define htobe32(x) OSSwapHostToBigInt32(x)
+#define htole32(x) OSSwapHostToLittleInt32(x)
+#define be32toh(x) OSSwapBigToHostInt32(x)
+#define le32toh(x) OSSwapLittleToHostInt32(x)
+
+#define htobe64(x) OSSwapHostToBigInt64(x)
+#define htole64(x) OSSwapHostToLittleInt64(x)
+#define be64toh(x) OSSwapBigToHostInt64(x)
+#define le64toh(x) OSSwapLittleToHostInt64(x)
+#endif
+
+#ifdef CONFIG_EVENTFD
+#include 
+#else
+#define eventfd_t uint64_t
+
+int eventfd_write(int fd, eventfd_t value)
+{
+return (write(fd, &value, sizeof(value)) == sizeof(value)) ? 0 : -1;
+}
+
+int eventfd_read(int fd, eventfd_t *value)
+{
+return (read(fd, value, sizeof(*value)) == sizeof(*value)) ? 0 : -1;
+}
+#endif
+
+#ifdef MFD_ALLOW_SEALING
+#include 
+
+#ifndef F_LINUX_SPECIFIC_BASE
+#define F_LINUX_SPECIFIC_BASE 1024
+#endif
+
+#ifndef F_ADD_SEALS
+#define F_ADD_SEALS (F_LINUX_SPECIFIC_BASE + 9)
+#define F_GET_SEALS (F_LINUX_SPECIFIC_BASE + 10)
+
+#define F_SEAL_SEAL 0x0001  /* prevent further seals from being set */
+#define F_SEAL_SHRINK   0x0002  /* prevent file from shrinking */
+#define F_SEAL_GROW 0x0004  /* prevent file from growing */
+#define F_SEAL_WRITE0x0008  /* prevent writes */
+#endif
+#endif
+
+#ifndef MAP_NORESERVE
+#define MAP_NORESERVE 0
+#endif
+
 #include "include/atomic.h"
 
 #include "libvhost-user.h"
-- 
2.45.1
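
The eventfd fallback in this patch can be tried in isolation with a sketch like the one below. The `sketch_` prefix is hypothetical and is there only so the code does not clash with a system that already ships <sys/eventfd.h>; the logic mirrors the plain read()/write() wrappers added above.

```c
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical names, chosen to avoid clashing with <sys/eventfd.h>. */
typedef uint64_t sketch_eventfd_t;

/* Write the 8-byte counter value; 0 on success, -1 otherwise. */
static int sketch_eventfd_write(int fd, sketch_eventfd_t value)
{
    ssize_t n = write(fd, &value, sizeof(value));

    return n == (ssize_t)sizeof(value) ? 0 : -1;
}

/* Read an 8-byte counter value; 0 on success, -1 otherwise. */
static int sketch_eventfd_read(int fd, sketch_eventfd_t *value)
{
    ssize_t n = read(fd, value, sizeof(*value));

    return n == (ssize_t)sizeof(*value) ? 0 : -1;
}
```

Note that these wrappers only move 8 bytes around: on a pipe or socket, successive writes queue up as separate values instead of being summed into one counter as a real eventfd would do.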




[PATCH v5 12/13] tests/qtest/vhost-user-blk-test: use memory-backend-shm

2024-05-23 Thread Stefano Garzarella
`memory-backend-memfd` is available only on Linux while the new
`memory-backend-shm` can be used on any POSIX-compliant operating
system. Let's use it so we can run the test in multiple environments.

Acked-by: Thomas Huth 
Acked-by: Stefan Hajnoczi 
Reviewed-by: Philippe Mathieu-Daudé 
Tested-by: Philippe Mathieu-Daudé 
Signed-off-by: Stefano Garzarella 
---
 tests/qtest/vhost-user-blk-test.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/qtest/vhost-user-blk-test.c 
b/tests/qtest/vhost-user-blk-test.c
index 117b9acd10..e945f6abf2 100644
--- a/tests/qtest/vhost-user-blk-test.c
+++ b/tests/qtest/vhost-user-blk-test.c
@@ -906,7 +906,7 @@ static void start_vhost_user_blk(GString *cmd_line, int 
vus_instances,
vhost_user_blk_bin);
 
 g_string_append_printf(cmd_line,
-" -object memory-backend-memfd,id=mem,size=256M,share=on "
+" -object memory-backend-shm,id=mem,size=256M,share=on "
 " -M memory-backend=mem -m 256M ");
 
 for (i = 0; i < vus_instances; i++) {
-- 
2.45.1




[PATCH v5 03/13] libvhost-user: mask F_INFLIGHT_SHMFD if memfd is not supported

2024-05-23 Thread Stefano Garzarella
libvhost-user will panic when receiving VHOST_USER_GET_INFLIGHT_FD
message if MFD_ALLOW_SEALING is not defined, since it's not able
to create a memfd.

VHOST_USER_GET_INFLIGHT_FD is used only if
VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD is negotiated. So, let's mask
that feature if the backend is not able to properly handle these
messages.

Reviewed-by: Philippe Mathieu-Daudé 
Tested-by: Philippe Mathieu-Daudé 
Acked-by: Stefan Hajnoczi 
Signed-off-by: Stefano Garzarella 
---
 subprojects/libvhost-user/libvhost-user.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/subprojects/libvhost-user/libvhost-user.c 
b/subprojects/libvhost-user/libvhost-user.c
index a11afd1960..2c20cdc16e 100644
--- a/subprojects/libvhost-user/libvhost-user.c
+++ b/subprojects/libvhost-user/libvhost-user.c
@@ -1674,6 +1674,17 @@ vu_get_protocol_features_exec(VuDev *dev, VhostUserMsg 
*vmsg)
 features |= dev->iface->get_protocol_features(dev);
 }
 
+#ifndef MFD_ALLOW_SEALING
+/*
+ * If MFD_ALLOW_SEALING is not defined, we are not able to handle
+ * VHOST_USER_GET_INFLIGHT_FD messages, since we can't create a memfd.
+ * Those messages are used only if VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD
+ * is negotiated. A device implementation can enable it, so let's mask
+ * it to avoid a runtime panic.
+ */
+features &= ~(1ULL << VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD);
+#endif
+
 vmsg_set_reply_u64(vmsg, features);
 return true;
 }
-- 
2.45.1
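
The masking done by the patch above is a single bit clear on the 64-bit protocol feature word. A small sketch, assuming the feature sits at bit 12 as listed in the vhost-user specification; `SKETCH_F_INFLIGHT_SHMFD`, `sketch_clear_feature_bit`, and `sketch_advertised_features` are hypothetical names, and the `have_memfd` flag stands in for the compile-time MFD_ALLOW_SEALING check:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: bit position taken from the vhost-user specification. */
#define SKETCH_F_INFLIGHT_SHMFD 12

/* Clear one bit of a 64-bit feature word. */
static uint64_t sketch_clear_feature_bit(uint64_t features, unsigned int bit)
{
    assert(bit < 64); /* shifting by 64 or more is undefined */
    return features & ~(1ULL << bit);
}

/* Hide the inflight feature when the backend cannot create a memfd. */
static uint64_t sketch_advertised_features(uint64_t features, bool have_memfd)
{
    return have_memfd
        ? features
        : sketch_clear_feature_bit(features, SKETCH_F_INFLIGHT_SHMFD);
}
```

Masking after the device callback has been OR-ed in is what keeps a device implementation from re-enabling the feature.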




[PATCH v5 02/13] libvhost-user: fail vu_message_write() if sendmsg() is failing

2024-05-23 Thread Stefano Garzarella
In vu_message_write() we use sendmsg() to send the message header,
then a write() to send the payload.

If sendmsg() fails we should avoid sending the payload, since we
were unable to send the header.

Discovered before fixing the issue with the previous patch, where
sendmsg() failed on macOS due to wrong parameters, but the frontend
still sent the payload which the backend incorrectly interpreted
as a wrong header.

Reviewed-by: Daniel P. Berrangé 
Reviewed-by: Philippe Mathieu-Daudé 
Tested-by: Philippe Mathieu-Daudé 
Acked-by: Stefan Hajnoczi 
Signed-off-by: Stefano Garzarella 
---
 subprojects/libvhost-user/libvhost-user.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/subprojects/libvhost-user/libvhost-user.c 
b/subprojects/libvhost-user/libvhost-user.c
index 22bea0c775..a11afd1960 100644
--- a/subprojects/libvhost-user/libvhost-user.c
+++ b/subprojects/libvhost-user/libvhost-user.c
@@ -639,6 +639,11 @@ vu_message_write(VuDev *dev, int conn_fd, VhostUserMsg 
*vmsg)
 rc = sendmsg(conn_fd, &msg, 0);
 } while (rc < 0 && (errno == EINTR || errno == EAGAIN));
 
+if (rc <= 0) {
+vu_panic(dev, "Error while writing: %s", strerror(errno));
+return false;
+}
+
 if (vmsg->size) {
 do {
 if (vmsg->data) {
-- 
2.45.1
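
The behaviour described in the patch above (header via sendmsg(), payload via write(), and no payload if the header failed) can be sketched in isolation. This is not the libvhost-user code; `sketch_send_header_then_payload` is a hypothetical helper that returns false instead of calling vu_panic().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Send a header with sendmsg(), then the payload with write().
 * If the header cannot be sent, the payload is never written. */
static bool sketch_send_header_then_payload(int fd,
                                            const void *hdr, size_t hdr_len,
                                            const void *payload,
                                            size_t payload_len)
{
    struct iovec iov = { .iov_base = (void *)hdr, .iov_len = hdr_len };
    struct msghdr msg;
    ssize_t rc;

    assert(hdr != NULL && hdr_len > 0);
    memset(&msg, 0, sizeof(msg)); /* msg_control stays NULL, controllen 0 */
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;

    do {
        rc = sendmsg(fd, &msg, 0);
    } while (rc < 0 && (errno == EINTR || errno == EAGAIN));

    if (rc <= 0) {
        return false; /* header not sent: skip the payload */
    }

    if (payload_len > 0) {
        do {
            rc = write(fd, payload, payload_len);
        } while (rc < 0 && (errno == EINTR || errno == EAGAIN));
        if (rc <= 0) {
            return false;
        }
    }
    return true;
}
```

Without the early return, a peer that never saw the header would parse the first payload bytes as a header, which is the misbehaviour the commit message describes.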




[PATCH v5 09/13] osdep: move O_DSYNC and O_DIRECT defines from file-posix

2024-05-23 Thread Stefano Garzarella
These defines are also useful for vhost-user-blk when it is compiled
in some POSIX systems that do not define them, so let's move them to
“qemu/osdep.h”.

Suggested-by: Philippe Mathieu-Daudé 
Signed-off-by: Stefano Garzarella 
---
 include/qemu/osdep.h | 14 ++
 block/file-posix.c   | 14 --
 2 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/include/qemu/osdep.h b/include/qemu/osdep.h
index f61edcfdc2..e165b5cb1b 100644
--- a/include/qemu/osdep.h
+++ b/include/qemu/osdep.h
@@ -325,6 +325,20 @@ void QEMU_ERROR("code path is reachable")
 #define ESHUTDOWN 4099
 #endif
 
+/* OS X does not have O_DSYNC */
+#ifndef O_DSYNC
+#ifdef O_SYNC
+#define O_DSYNC O_SYNC
+#elif defined(O_FSYNC)
+#define O_DSYNC O_FSYNC
+#endif
+#endif
+
+/* Approximate O_DIRECT with O_DSYNC if O_DIRECT isn't available */
+#ifndef O_DIRECT
+#define O_DIRECT O_DSYNC
+#endif
+
 #define RETRY_ON_EINTR(expr) \
 (__extension__  \
 ({ typeof(expr) __result;   \
diff --git a/block/file-posix.c b/block/file-posix.c
index 35684f7e21..7a196a2abf 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -110,20 +110,6 @@
 #include 
 #endif
 
-/* OS X does not have O_DSYNC */
-#ifndef O_DSYNC
-#ifdef O_SYNC
-#define O_DSYNC O_SYNC
-#elif defined(O_FSYNC)
-#define O_DSYNC O_FSYNC
-#endif
-#endif
-
-/* Approximate O_DIRECT with O_DSYNC if O_DIRECT isn't available */
-#ifndef O_DIRECT
-#define O_DIRECT O_DSYNC
-#endif
-
 #define FTYPE_FILE   0
 #define FTYPE_CD 1
 
-- 
2.45.1
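
For code that cannot include "qemu/osdep.h", the same fallback chain can be reproduced locally. A minimal sketch, assuming the including file only needs open() flags; `sketch_open_flags` is a hypothetical helper and not taken from vhost-user-blk:

```c
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>

/* Same fallback chain as in the patch: O_DSYNC -> O_SYNC -> O_FSYNC. */
#ifndef O_DSYNC
#ifdef O_SYNC
#define O_DSYNC O_SYNC
#elif defined(O_FSYNC)
#define O_DSYNC O_FSYNC
#endif
#endif

/* Approximate O_DIRECT with O_DSYNC where O_DIRECT is not defined. */
#ifndef O_DIRECT
#define O_DIRECT O_DSYNC
#endif

/* Build open() flags from an access mode plus an optional "direct" request. */
static int sketch_open_flags(int access_mode, bool direct)
{
    assert(access_mode == O_RDONLY || access_mode == O_WRONLY ||
           access_mode == O_RDWR);
    return direct ? (access_mode | O_DIRECT) : access_mode;
}
```

On systems where the fallback applies, a "direct" open is only synchronous, not cache-bypassing, so it is an approximation as the original comment says.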




[PATCH v5 04/13] vhost-user-server: do not set memory fd non-blocking

2024-05-23 Thread Stefano Garzarella
In vhost-user-server we set all fd received from the other peer
in non-blocking mode. For some of them (e.g. memfd, shm_open, etc.)
it's not really needed, because we don't use these fd with blocking
operations, but only to map memory.

In addition, in some systems this operation can fail (e.g. in macOS
setting an fd returned by shm_open() non-blocking fails with errno
= ENOTTY).

So, let's avoid setting fd non-blocking for those messages that we
know carry memory fd (e.g. VHOST_USER_ADD_MEM_REG,
VHOST_USER_SET_MEM_TABLE).

Reviewed-by: Daniel P. Berrangé 
Acked-by: Stefan Hajnoczi 
Signed-off-by: Stefano Garzarella 
---
v3:
- avoiding setting fd non-blocking for messages where we have memory fd
  (Eric)
---
 util/vhost-user-server.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/util/vhost-user-server.c b/util/vhost-user-server.c
index 3bfb1ad3ec..b19229074a 100644
--- a/util/vhost-user-server.c
+++ b/util/vhost-user-server.c
@@ -65,6 +65,18 @@ static void vmsg_close_fds(VhostUserMsg *vmsg)
 static void vmsg_unblock_fds(VhostUserMsg *vmsg)
 {
 int i;
+
+/*
+ * These messages carry fd used to map memory, not to send/receive messages,
+ * so this operation is useless. In addition, in some systems this
+ * operation can fail (e.g. in macOS setting an fd returned by shm_open()
+ * non-blocking fails with errno = ENOTTY)
+ */
+if (vmsg->request == VHOST_USER_ADD_MEM_REG ||
+vmsg->request == VHOST_USER_SET_MEM_TABLE) {
+return;
+}
+
 for (i = 0; i < vmsg->fd_num; i++) {
 qemu_socket_set_nonblock(vmsg->fds[i]);
 }
-- 
2.45.1
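
The early return added above can be illustrated outside QEMU. A minimal sketch, assuming non-blocking mode is set with fcntl(); the request enum and the `sketch_*` helpers are hypothetical stand-ins for the VHOST_USER_* values and qemu_socket_set_nonblock():

```c
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>

/* Hypothetical request kinds; the real protocol uses VHOST_USER_* values. */
enum sketch_request {
    SKETCH_SET_MEM_TABLE,
    SKETCH_ADD_MEM_REG,
    SKETCH_SET_VRING_KICK,
};

/* True for requests whose fds are only mapped, never read or written. */
static bool sketch_carries_memory_fds(enum sketch_request req)
{
    return req == SKETCH_SET_MEM_TABLE || req == SKETCH_ADD_MEM_REG;
}

/* Set O_NONBLOCK on one fd; returns 0 on success, -1 on failure. */
static int sketch_set_nonblock(int fd)
{
    int flags = fcntl(fd, F_GETFL);

    if (flags < 0) {
        return -1;
    }
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Returns how many fds were switched to non-blocking mode. */
static int sketch_unblock_fds(enum sketch_request req, const int *fds,
                              int fd_num)
{
    int changed = 0;

    assert(fd_num >= 0);
    if (sketch_carries_memory_fds(req)) {
        return 0; /* memory fds are left untouched */
    }
    for (int i = 0; i < fd_num; i++) {
        if (sketch_set_nonblock(fds[i]) == 0) {
            changed++;
        }
    }
    return changed;
}
```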




[PATCH v5 06/13] contrib/vhost-user-*: use QEMU bswap helper functions

2024-05-23 Thread Stefano Garzarella
Let's replace the calls to le*toh() and htole*() with qemu/bswap.h
helpers to make the code more portable.

Suggested-by: Philippe Mathieu-Daudé 
Reviewed-by: Philippe Mathieu-Daudé 
Tested-by: Philippe Mathieu-Daudé 
Acked-by: Stefan Hajnoczi 
Signed-off-by: Stefano Garzarella 
---
 contrib/vhost-user-blk/vhost-user-blk.c |  9 +
 contrib/vhost-user-input/main.c | 16 
 2 files changed, 13 insertions(+), 12 deletions(-)

diff --git a/contrib/vhost-user-blk/vhost-user-blk.c 
b/contrib/vhost-user-blk/vhost-user-blk.c
index a8ab9269a2..9492146855 100644
--- a/contrib/vhost-user-blk/vhost-user-blk.c
+++ b/contrib/vhost-user-blk/vhost-user-blk.c
@@ -16,6 +16,7 @@
  */
 
 #include "qemu/osdep.h"
+#include "qemu/bswap.h"
 #include "standard-headers/linux/virtio_blk.h"
 #include "libvhost-user-glib.h"
 
@@ -194,8 +195,8 @@ vub_discard_write_zeroes(VubReq *req, struct iovec *iov, 
uint32_t iovcnt,
 #if defined(__linux__) && defined(BLKDISCARD) && defined(BLKZEROOUT)
 VubDev *vdev_blk = req->vdev_blk;
 desc = buf;
-uint64_t range[2] = { le64toh(desc->sector) << 9,
-  le32toh(desc->num_sectors) << 9 };
+uint64_t range[2] = { le64_to_cpu(desc->sector) << 9,
+  le32_to_cpu(desc->num_sectors) << 9 };
 if (type == VIRTIO_BLK_T_DISCARD) {
 if (ioctl(vdev_blk->blk_fd, BLKDISCARD, range) == 0) {
 g_free(buf);
@@ -267,13 +268,13 @@ static int vub_virtio_process_req(VubDev *vdev_blk,
 req->in = (struct virtio_blk_inhdr *)elem->in_sg[in_num - 1].iov_base;
 in_num--;
 
-type = le32toh(req->out->type);
+type = le32_to_cpu(req->out->type);
 switch (type & ~VIRTIO_BLK_T_BARRIER) {
 case VIRTIO_BLK_T_IN:
 case VIRTIO_BLK_T_OUT: {
 ssize_t ret = 0;
 bool is_write = type & VIRTIO_BLK_T_OUT;
-req->sector_num = le64toh(req->out->sector);
+req->sector_num = le64_to_cpu(req->out->sector);
 if (is_write) {
 ret  = vub_writev(req, &elem->out_sg[1], out_num);
 } else {
diff --git a/contrib/vhost-user-input/main.c b/contrib/vhost-user-input/main.c
index 081230da54..f3362d41ac 100644
--- a/contrib/vhost-user-input/main.c
+++ b/contrib/vhost-user-input/main.c
@@ -51,8 +51,8 @@ static void vi_input_send(VuInput *vi, struct 
virtio_input_event *event)
 vi->queue[vi->qindex++].event = *event;
 
 /* ... until we see a report sync ... */
-if (event->type != htole16(EV_SYN) ||
-event->code != htole16(SYN_REPORT)) {
+if (event->type != cpu_to_le16(EV_SYN) ||
+event->code != cpu_to_le16(SYN_REPORT)) {
 return;
 }
 
@@ -103,9 +103,9 @@ vi_evdev_watch(VuDev *dev, int condition, void *data)
 
 g_debug("input %d %d %d", evdev.type, evdev.code, evdev.value);
 
-virtio.type  = htole16(evdev.type);
-virtio.code  = htole16(evdev.code);
-virtio.value = htole32(evdev.value);
+virtio.type  = cpu_to_le16(evdev.type);
+virtio.code  = cpu_to_le16(evdev.code);
+virtio.value = cpu_to_le32(evdev.value);
 vi_input_send(vi, &virtio);
 }
 }
@@ -124,9 +124,9 @@ static void vi_handle_status(VuInput *vi, 
virtio_input_event *event)
 
 evdev.input_event_sec = tval.tv_sec;
 evdev.input_event_usec = tval.tv_usec;
-evdev.type = le16toh(event->type);
-evdev.code = le16toh(event->code);
-evdev.value = le32toh(event->value);
+evdev.type = le16_to_cpu(event->type);
+evdev.code = le16_to_cpu(event->code);
+evdev.value = le32_to_cpu(event->value);
 
 rc = write(vi->evdevfd, &evdev, sizeof(evdev));
 if (rc == -1) {
-- 
2.45.1




[PATCH v5 01/13] libvhost-user: set msg.msg_control to NULL when it is empty

2024-05-23 Thread Stefano Garzarella
On some OS (e.g. macOS) sendmsg() returns -1 (errno EINVAL) if
the `struct msghdr` has the field `msg_controllen` set to 0, but
`msg_control` is not NULL.

Reviewed-by: Eric Blake 
Reviewed-by: David Hildenbrand 
Reviewed-by: Philippe Mathieu-Daudé 
Tested-by: Philippe Mathieu-Daudé 
Acked-by: Stefan Hajnoczi 
Signed-off-by: Stefano Garzarella 
---
 subprojects/libvhost-user/libvhost-user.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/subprojects/libvhost-user/libvhost-user.c 
b/subprojects/libvhost-user/libvhost-user.c
index a879149fef..22bea0c775 100644
--- a/subprojects/libvhost-user/libvhost-user.c
+++ b/subprojects/libvhost-user/libvhost-user.c
@@ -632,6 +632,7 @@ vu_message_write(VuDev *dev, int conn_fd, VhostUserMsg 
*vmsg)
 memcpy(CMSG_DATA(cmsg), vmsg->fds, fdsize);
 } else {
 msg.msg_controllen = 0;
+msg.msg_control = NULL;
 }
 
 do {
-- 
2.45.1
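
The rule fixed by the patch above is that `msg_control` and `msg_controllen` must agree: either both describe an ancillary buffer, or both are empty. A sketch of a helper that keeps them consistent, assuming the caller provides a suitably aligned buffer of at least CMSG_SPACE(SKETCH_MAX_FDS * sizeof(int)) bytes; `SKETCH_MAX_FDS` and `sketch_set_control` are hypothetical names:

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>

#define SKETCH_MAX_FDS 8 /* hypothetical limit for this sketch */

/* Attach fds as SCM_RIGHTS ancillary data, or leave the control fields
 * as NULL/0 when there is nothing to attach. */
static void sketch_set_control(struct msghdr *msg, char *control,
                               const int *fds, size_t fd_num)
{
    assert(fd_num <= SKETCH_MAX_FDS);

    if (fd_num > 0) {
        size_t fdsize = fd_num * sizeof(int);
        struct cmsghdr *cmsg;

        msg->msg_control = control;
        msg->msg_controllen = CMSG_SPACE(fdsize);
        cmsg = CMSG_FIRSTHDR(msg);
        cmsg->cmsg_len = CMSG_LEN(fdsize);
        cmsg->cmsg_level = SOL_SOCKET;
        cmsg->cmsg_type = SCM_RIGHTS;
        memcpy(CMSG_DATA(cmsg), fds, fdsize);
    } else {
        /* Some systems (e.g. macOS) reject a non-NULL pointer here. */
        msg->msg_control = NULL;
        msg->msg_controllen = 0;
    }
}
```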




[PATCH v5 00/13] vhost-user: support any POSIX system (tested on macOS, FreeBSD, OpenBSD)

2024-05-23 Thread Stefano Garzarella
v1: https://patchew.org/QEMU/20240228114759.44758-1-sgarz...@redhat.com/
v2: https://patchew.org/QEMU/20240326133936.125332-1-sgarz...@redhat.com/
v3: https://patchew.org/QEMU/20240404122330.92710-1-sgarz...@redhat.com/
v4: https://patchew.org/QEMU/20240508074457.12367-1-sgarz...@redhat.com/
v5:
- rebased on 7e1c0047015ffbd408e1aa4a5ec1abe4751dbf7e
- added some R-b/A-b/T-b tags [Daniel, Phil, Thomas, Stefan thanks!]
- added new patch to move O_DSYNC and O_DIRECT defines in osdep [Phil]
- fixed memory-backend-shm documentation in qapi/qom.json and qemu-options.hx
  [Markus]
- fixed typos in some commits description [Phil]

The vhost-user protocol is not really Linux-specific, so let's try to support
QEMU's frontends and backends (including libvhost-user) in any POSIX system
with this series. The main use case is to be able to use virtio devices that
we don't have built-in in QEMU (e.g. virtiofsd, vhost-user-vsock, etc.) even
in non-Linux systems.

The first 5 patches are more like fixes discovered at runtime on macOS or
FreeBSD that could go even independently of this series.

Patches 6, 7, 8, 9, and 10 enable building of frontends and backends (including
libvhost-user) with associated code changes to succeed in compilation.

Patch 11 adds `memory-backend-shm` that uses the POSIX shm_open() API to
create shared memory which is identified by an fd that can be shared with
vhost-user backends. This is useful on those systems (like macOS) where
we don't have memfd_create() or special filesystems like "/dev/shm".
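
To make the shm_open() approach concrete, here is a minimal sketch of creating "practically anonymous" shared memory: open a named object, unlink the name right away, and keep only the fd. This is not the code from backends/hostmem-shm.c; `sketch_anonymous_shm` and the pid-based name are assumptions made for illustration, and some older C libraries need `-lrt` at link time.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns an fd backed by an unlinked POSIX shm object of `size` bytes,
 * or -1 on failure. */
static int sketch_anonymous_shm(size_t size)
{
    char name[64];
    int fd;

    assert(size > 0);
    /* Assumption: a pid-based name is unique enough for this sketch. */
    snprintf(name, sizeof(name), "/sketch-shm-%ld", (long)getpid());

    fd = shm_open(name, O_RDWR | O_CREAT | O_EXCL, 0600);
    if (fd < 0) {
        return -1;
    }
    /* Drop the name immediately: from here on only the fd refers to it. */
    shm_unlink(name);

    if (ftruncate(fd, (off_t)size) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

The returned fd can then be passed to mmap() with MAP_SHARED and handed to another process over a UNIX socket, which is what a vhost-user backend needs.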

Patches 12 and 13 use `memory-backend-shm` in some vhost-user tests.

Maybe the first 5 patches can go separately, but I only discovered those
problems after testing patches 6 - 10, so I have included them in this series
for now. Please let me know if you prefer that I send them separately.

I tested this series using vhost-user-blk and QSD on macOS Sonoma 14.4
(aarch64), FreeBSD 14 (x86_64), OpenBSD 7.4 (x86_64), and Fedora 40 (x86_64)
in this way:

- Start vhost-user-blk or QSD (same commands for all systems)

  vhost-user-blk -s /tmp/vhost.socket \
-b Fedora-Cloud-Base-39-1.5.x86_64.raw

  qemu-storage-daemon \
--blockdev file,filename=Fedora-Cloud-Base-39-1.5.x86_64.qcow2,node-name=file \
--blockdev qcow2,file=file,node-name=qcow2 \
--export vhost-user-blk,addr.type=unix,addr.path=/tmp/vhost.socket,id=vub,num-queues=1,node-name=qcow2,writable=on

- macOS (aarch64): start QEMU (using hvf accelerator)

  qemu-system-aarch64 -smp 2 -cpu host -M virt,accel=hvf,memory-backend=mem \
-drive file=./build/pc-bios/edk2-aarch64-code.fd,if=pflash,format=raw,readonly=on \
-device virtio-net-device,netdev=net0 -netdev user,id=net0 \
-device ramfb -device usb-ehci -device usb-kbd \
-object memory-backend-shm,id=mem,size=512M \
-device vhost-user-blk-pci,num-queues=1,disable-legacy=on,chardev=char0 \
-chardev socket,id=char0,path=/tmp/vhost.socket

- FreeBSD/OpenBSD (x86_64): start QEMU (no accelerators available)

  qemu-system-x86_64 -smp 2 -M q35,memory-backend=mem \
-object memory-backend-shm,id=mem,size="512M" \
-device vhost-user-blk-pci,num-queues=1,chardev=char0 \
-chardev socket,id=char0,path=/tmp/vhost.socket

- Fedora (x86_64): start QEMU (using kvm accelerator)

  qemu-system-x86_64 -smp 2 -M q35,accel=kvm,memory-backend=mem \
-object memory-backend-shm,id=mem,size="512M" \
-device vhost-user-blk-pci,num-queues=1,chardev=char0 \
-chardev socket,id=char0,path=/tmp/vhost.socket

Branch pushed (and CI started) at 
https://gitlab.com/sgarzarella/qemu/-/tree/macos-vhost-user?ref_type=heads

Thanks,
Stefano

Stefano Garzarella (13):
  libvhost-user: set msg.msg_control to NULL when it is empty
  libvhost-user: fail vu_message_write() if sendmsg() is failing
  libvhost-user: mask F_INFLIGHT_SHMFD if memfd is not supported
  vhost-user-server: do not set memory fd non-blocking
  contrib/vhost-user-blk: fix bind() using the right size of the address
  contrib/vhost-user-*: use QEMU bswap helper functions
  vhost-user: enable frontends on any POSIX system
  libvhost-user: enable it on any POSIX system
  osdep: move O_DSYNC and O_DIRECT defines from file-posix
  contrib/vhost-user-blk: enable it on any POSIX system
  hostmem: add a new memory backend based on POSIX shm_open()
  tests/qtest/vhost-user-blk-test: use memory-backend-shm
  tests/qtest/vhost-user-test: add a test case for memory-backend-shm

 docs/system/devices/vhost-user.rst|   5 +-
 meson.build   |   5 +-
 qapi/qom.json |  19 
 include/qemu/osdep.h  |  14 +++
 subprojects/libvhost-user/libvhost-user.h |   2 +-
 backends/hostmem-shm.c| 123 ++
 block/file-posix.c|  14 ---
 contrib/vhost-user-blk/vhost-user-blk.c   |  13 ++-
 contrib/vhost-user-input/main.c   |  16 +--
 hw/net/vhost_net.c  

Re: [PATCH v4 10/12] hostmem: add a new memory backend based on POSIX shm_open()

2024-05-10 Thread Stefano Garzarella

On Wed, May 08, 2024 at 01:59:33PM GMT, Markus Armbruster wrote:

Stefano Garzarella  writes:


shm_open() creates and opens a new POSIX shared memory object.
A POSIX shared memory object allows creating memory backend with an
associated file descriptor that can be shared with external processes
(e.g. vhost-user).

The new `memory-backend-shm` can be used as an alternative when
`memory-backend-memfd` is not available (Linux only), since shm_open()
should be provided by any POSIX-compliant operating system.

This backend mimics memfd, allocating memory that is practically
anonymous. In theory shm_open() requires a name, but this is allocated
for a short time interval and shm_unlink() is called right after
shm_open(). After that, only fd is shared with external processes
(e.g., vhost-user) as if it were associated with anonymous memory.

In the future we may also allow the user to specify the name to be
passed to shm_open(), but for now we keep the backend simple, mimicking
anonymous memory such as memfd.

Acked-by: David Hildenbrand 
Signed-off-by: Stefano Garzarella 
---
v4
- fail if we find "share=off" in shm_backend_memory_alloc() [David]
v3
- enriched commit message and documentation to highlight that we
  want to mimic memfd (David)
---
 docs/system/devices/vhost-user.rst |   5 +-
 qapi/qom.json  |  17 
 backends/hostmem-shm.c | 123 +
 backends/meson.build   |   1 +
 qemu-options.hx|  13 +++
 5 files changed, 157 insertions(+), 2 deletions(-)
 create mode 100644 backends/hostmem-shm.c

diff --git a/docs/system/devices/vhost-user.rst 
b/docs/system/devices/vhost-user.rst
index 9b2da106ce..35259d8ec7 100644
--- a/docs/system/devices/vhost-user.rst
+++ b/docs/system/devices/vhost-user.rst
@@ -98,8 +98,9 @@ Shared memory object

 In order for the daemon to access the VirtIO queues to process the
 requests it needs access to the guest's address space. This is
-achieved via the ``memory-backend-file`` or ``memory-backend-memfd``
-objects. A reference to a file-descriptor which can access this object
+achieved via the ``memory-backend-file``, ``memory-backend-memfd``, or
+``memory-backend-shm`` objects.
+A reference to a file-descriptor which can access this object
 will be passed via the socket as part of the protocol negotiation.

 Currently the shared memory object needs to match the size of the main
diff --git a/qapi/qom.json b/qapi/qom.json
index 38dde6d785..52df052df8 100644
--- a/qapi/qom.json
+++ b/qapi/qom.json
@@ -721,6 +721,19 @@
 '*hugetlbsize': 'size',
 '*seal': 'bool' } }

+##
+# @MemoryBackendShmProperties:
+#
+# Properties for memory-backend-shm objects.
+#
+# The @share boolean option is true by default with shm.


This contradicts the doc comment for @share:

  # @share: if false, the memory is private to QEMU; if true, it is
  # shared (default: false)

Your intention is to override that text.  But that's less than clear.
Moreover, the documentation of @share is pretty far from this override.
John Snow is working on patches that'll pull it closer.

Hmm, MemoryBackendMemfdProperties has the same override.


Yep, I followed @MemoryBackendMemfdProperties and 
@MemoryBackendEpcProperties.




I think we should change the doc comment for @share to something like

  # @share: if false, the memory is private to QEMU; if true, it is
  # shared (default depends on the backend type)

and then document the actual default with each backend type.


I agree on that, but I think we should do it in a separate series/patch.
If you prefer, I can send that patch.




+#
+# Since: 9.1
+##
+{ 'struct': 'MemoryBackendShmProperties',
+  'base': 'MemoryBackendProperties',
+  'data': { } }


Let's add 'if': 'CONFIG_POSIX' here.



Do you mean something like this:

{ 'struct': 'MemoryBackendShmProperties',
  'if': 'CONFIG_POSIX',
  'base': 'MemoryBackendProperties',
  'data': { } }

I didn't because for MemoryBackendMemfdProperties and
MemoryBackendEpcProperties we have 'if': 'CONFIG_POSIX' only later in
the ObjectOptions union, so I did the same.

Should we fix them as well?


+
 ##
 # @MemoryBackendEpcProperties:
 #
@@ -985,6 +998,8 @@
 { 'name': 'memory-backend-memfd',
   'if': 'CONFIG_LINUX' },
 'memory-backend-ram',
+{ 'name': 'memory-backend-shm',
+  'if': 'CONFIG_POSIX' },
 'pef-guest',
 { 'name': 'pr-manager-helper',
   'if': 'CONFIG_LINUX' },
@@ -1056,6 +1071,8 @@
   'memory-backend-memfd':   { 'type': 'MemoryBackendMemfdProperties',
   'if': 'CONFIG_LINUX' },
   'memory-backend-ram': 'MemoryBackendProperties',
+  'memory-backend-shm': { 'type': 'MemoryBackendShmProperties',
+  'if': 'CONFIG_POSIX' },
   'pr-manager-helper':  { 'type': 'PrManagerHelperProperties',
   'if': 'CO

Re: [PATCH v4 09/12] contrib/vhost-user-blk: enable it on any POSIX system

2024-05-10 Thread Stefano Garzarella

On Wed, May 08, 2024 at 12:32:08PM GMT, Philippe Mathieu-Daudé wrote:

On 8/5/24 09:44, Stefano Garzarella wrote:

Let's make the code more portable by adding defines from
block/file-posix.c to support O_DIRECT in other systems (e.g. macOS).

vhost-user-server.c is a dependency, let's enable it for any POSIX
system.

Signed-off-by: Stefano Garzarella 
---
v4:
- moved using of "qemu/bswap.h" API in a separate patch [Phil]
---
 meson.build |  2 --
 contrib/vhost-user-blk/vhost-user-blk.c | 14 ++
 util/meson.build|  4 +++-
 3 files changed, 17 insertions(+), 3 deletions(-)




diff --git a/contrib/vhost-user-blk/vhost-user-blk.c 
b/contrib/vhost-user-blk/vhost-user-blk.c
index 9492146855..a450337685 100644
--- a/contrib/vhost-user-blk/vhost-user-blk.c
+++ b/contrib/vhost-user-blk/vhost-user-blk.c
@@ -25,6 +25,20 @@
 #include 
 #endif
+/* OS X does not have O_DSYNC */
+#ifndef O_DSYNC
+#ifdef O_SYNC
+#define O_DSYNC O_SYNC
+#elif defined(O_FSYNC)
+#define O_DSYNC O_FSYNC
+#endif
+#endif
+
+/* Approximate O_DIRECT with O_DSYNC if O_DIRECT isn't available */
+#ifndef O_DIRECT
+#define O_DIRECT O_DSYNC
+#endif


Could we add that in "qemu/osdep.h" instead?


Since "qemu/osdep.h" includes fcntl.h, I think it could be fine.

@Hanna, @Kevin WDYT?



Otherwise,
Reviewed-by: Philippe Mathieu-Daudé 
Tested-by: Philippe Mathieu-Daudé 



Thanks,
Stefano




Re: [PATCH v4 08/12] libvhost-user: enable it on any POSIX system

2024-05-10 Thread Stefano Garzarella

On Wed, May 08, 2024 at 12:36:30PM GMT, Philippe Mathieu-Daudé wrote:

On 8/5/24 09:44, Stefano Garzarella wrote:

The vhost-user protocol is not really Linux-specific so let's enable
libvhost-user for any POSIX system.

Compiling it on macOS and FreeBSD some problems came up:
- avoid to include linux/vhost.h which is avaibale only on Linux


"available"


  (vhost_types.h contains many of the things we need)
- macOS doesn't provide sys/endian.h, so let's define them
  (note: libvhost-user doesn't include qemu's headers, so we can't use


"QEMU"



Good catches, I'll fix them!


   use "qemu/bswap.h")
- define eventfd_[write|read] as write/read wrapper when system doesn't
  provide those (e.g. macOS)
- copy SEAL defines from include/qemu/memfd.h to make the code work
  on FreeBSD where MFD_ALLOW_SEALING is defined


Alternatively add in subprojects/libvhost-user/include/osdep.h.


I like the idea, but we also have other things already present before
this patch (e.g. G_GNUC_PRINTF, MIN, etc.) so do you think it's better
to add 2 patches (move everything to osdep.h, add things from this
patch), or after this series is merged, send a patch to introduce
osdep.h?

I'm leaning toward the last option just to prevent this series from
becoming too big, but I don't have a strong opinion.

Thanks,
Stefano




- define MAP_NORESERVE if it's not defined (e.g. on FreeBSD)

Signed-off-by: Stefano Garzarella 
---
 meson.build   |  2 +-
 subprojects/libvhost-user/libvhost-user.h |  2 +-
 subprojects/libvhost-user/libvhost-user.c | 60 +--
 3 files changed, 59 insertions(+), 5 deletions(-)


Reviewed-by: Philippe Mathieu-Daudé 
Tested-by: Philippe Mathieu-Daudé 







Re: [PATCH v4 03/12] libvhost-user: mask F_INFLIGHT_SHMFD if memfd is not supported

2024-05-10 Thread Stefano Garzarella

On Wed, May 08, 2024 at 12:39:33PM GMT, Philippe Mathieu-Daudé wrote:

On 8/5/24 09:44, Stefano Garzarella wrote:

libvhost-user will panic when receiving VHOST_USER_GET_INFLIGHT_FD
message if MFD_ALLOW_SEALING is not defined, since it's not able
to create a memfd.

VHOST_USER_GET_INFLIGHT_FD is used only if
VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD is negotiated. So, let's mask
that feature if the backend is not able to properly handle these
messages.

Signed-off-by: Stefano Garzarella 
---
 subprojects/libvhost-user/libvhost-user.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/subprojects/libvhost-user/libvhost-user.c 
b/subprojects/libvhost-user/libvhost-user.c
index a11afd1960..1c361ffd51 100644
--- a/subprojects/libvhost-user/libvhost-user.c
+++ b/subprojects/libvhost-user/libvhost-user.c
@@ -1674,6 +1674,16 @@ vu_get_protocol_features_exec(VuDev *dev, VhostUserMsg 
*vmsg)
 features |= dev->iface->get_protocol_features(dev);
 }


Maybe move the #ifndef here?


Yep, I'll do that.




+/*
+ * If MFD_ALLOW_SEALING is not defined, we are not able to handle
+ * VHOST_USER_GET_INFLIGHT_FD messages, since we can't create a memfd.
+ * Those messages are used only if VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD
+ * is negotiated. A device implementation can enable it, so let's mask
+ * it to avoid a runtime panic.
+ */
+#ifndef MFD_ALLOW_SEALING
+features &= ~(1ULL << VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD);
+#endif
 vmsg_set_reply_u64(vmsg, features);
 return true;
 }


Reviewed-by: Philippe Mathieu-Daudé 
Tested-by: Philippe Mathieu-Daudé 



Thanks,
Stefano




Re: [PATCH v4 01/12] libvhost-user: set msg.msg_control to NULL when it is empty

2024-05-08 Thread Stefano Garzarella

On Wed, May 08, 2024 at 09:57:13AM GMT, Daniel P. Berrangé wrote:

On Wed, May 08, 2024 at 09:44:45AM +0200, Stefano Garzarella wrote:

On some OS (e.g. macOS) sendmsg() returns -1 (errno EINVAL) if
the `struct msghdr` has the field `msg_controllen` set to 0, but
`msg_control` is not NULL.

Reviewed-by: Eric Blake 
Reviewed-by: David Hildenbrand 
Reviewed-by: Philippe Mathieu-Daud?? 


Philippe's name has got mangled here


Thank you for bringing this to my attention and helping me solve it
off-list.

It should be fixed with the next posting!

Stefano




Signed-off-by: Stefano Garzarella 
---
 subprojects/libvhost-user/libvhost-user.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/subprojects/libvhost-user/libvhost-user.c 
b/subprojects/libvhost-user/libvhost-user.c
index a879149fef..22bea0c775 100644
--- a/subprojects/libvhost-user/libvhost-user.c
+++ b/subprojects/libvhost-user/libvhost-user.c
@@ -632,6 +632,7 @@ vu_message_write(VuDev *dev, int conn_fd, VhostUserMsg 
*vmsg)
 memcpy(CMSG_DATA(cmsg), vmsg->fds, fdsize);
 } else {
 msg.msg_controllen = 0;
+msg.msg_control = NULL;
 }

 do {
--
2.45.0



With regards,
Daniel
--
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|






[PATCH v4 04/12] vhost-user-server: do not set memory fd non-blocking

2024-05-08 Thread Stefano Garzarella
In vhost-user-server we set all fd received from the other peer
in non-blocking mode. For some of them (e.g. memfd, shm_open, etc.)
it's not really needed, because we don't use these fd with blocking
operations, but only to map memory.

In addition, in some systems this operation can fail (e.g. in macOS
setting an fd returned by shm_open() non-blocking fails with errno
= ENOTTY).

So, let's avoid setting fd non-blocking for those messages that we
know carry memory fd (e.g. VHOST_USER_ADD_MEM_REG,
VHOST_USER_SET_MEM_TABLE).

Signed-off-by: Stefano Garzarella 
---
v3:
- avoiding setting fd non-blocking for messages where we have memory fd
  (Eric)
---
 util/vhost-user-server.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/util/vhost-user-server.c b/util/vhost-user-server.c
index 3bfb1ad3ec..b19229074a 100644
--- a/util/vhost-user-server.c
+++ b/util/vhost-user-server.c
@@ -65,6 +65,18 @@ static void vmsg_close_fds(VhostUserMsg *vmsg)
 static void vmsg_unblock_fds(VhostUserMsg *vmsg)
 {
 int i;
+
+/*
+ * These messages carry fd used to map memory, not to send/receive messages,
+ * so this operation is useless. In addition, in some systems this
+ * operation can fail (e.g. in macOS setting an fd returned by shm_open()
+ * non-blocking fails with errno = ENOTTY)
+ */
+if (vmsg->request == VHOST_USER_ADD_MEM_REG ||
+vmsg->request == VHOST_USER_SET_MEM_TABLE) {
+return;
+}
+
 for (i = 0; i < vmsg->fd_num; i++) {
 qemu_socket_set_nonblock(vmsg->fds[i]);
 }
-- 
2.45.0




[PATCH v4 02/12] libvhost-user: fail vu_message_write() if sendmsg() is failing

2024-05-08 Thread Stefano Garzarella
In vu_message_write() we use sendmsg() to send the message header,
then a write() to send the payload.

If sendmsg() fails we should avoid sending the payload, since we
were unable to send the header.

Discovered before fixing the issue with the previous patch, where
sendmsg() failed on macOS due to wrong parameters, but the frontend
still sent the payload which the backend incorrectly interpreted
as a wrong header.

Signed-off-by: Stefano Garzarella 
---
 subprojects/libvhost-user/libvhost-user.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/subprojects/libvhost-user/libvhost-user.c 
b/subprojects/libvhost-user/libvhost-user.c
index 22bea0c775..a11afd1960 100644
--- a/subprojects/libvhost-user/libvhost-user.c
+++ b/subprojects/libvhost-user/libvhost-user.c
@@ -639,6 +639,11 @@ vu_message_write(VuDev *dev, int conn_fd, VhostUserMsg 
*vmsg)
 rc = sendmsg(conn_fd, &msg, 0);
 } while (rc < 0 && (errno == EINTR || errno == EAGAIN));
 
+if (rc <= 0) {
+vu_panic(dev, "Error while writing: %s", strerror(errno));
+return false;
+}
+
 if (vmsg->size) {
 do {
 if (vmsg->data) {
-- 
2.45.0




[PATCH v4 10/12] hostmem: add a new memory backend based on POSIX shm_open()

2024-05-08 Thread Stefano Garzarella
shm_open() creates and opens a new POSIX shared memory object.
A POSIX shared memory object allows creating memory backend with an
associated file descriptor that can be shared with external processes
(e.g. vhost-user).

The new `memory-backend-shm` can be used as an alternative when
`memory-backend-memfd` is not available (Linux only), since shm_open()
should be provided by any POSIX-compliant operating system.

This backend mimics memfd, allocating memory that is practically
anonymous. In theory shm_open() requires a name, but this is allocated
for a short time interval and shm_unlink() is called right after
shm_open(). After that, only fd is shared with external processes
(e.g., vhost-user) as if it were associated with anonymous memory.

In the future we may also allow the user to specify the name to be
passed to shm_open(), but for now we keep the backend simple, mimicking
anonymous memory such as memfd.

Acked-by: David Hildenbrand 
Signed-off-by: Stefano Garzarella 
---
v4
- fail if we find "share=off" in shm_backend_memory_alloc() [David]
v3
- enriched commit message and documentation to highlight that we
  want to mimic memfd (David)
---
 docs/system/devices/vhost-user.rst |   5 +-
 qapi/qom.json  |  17 
 backends/hostmem-shm.c | 123 +
 backends/meson.build   |   1 +
 qemu-options.hx|  13 +++
 5 files changed, 157 insertions(+), 2 deletions(-)
 create mode 100644 backends/hostmem-shm.c

diff --git a/docs/system/devices/vhost-user.rst b/docs/system/devices/vhost-user.rst
index 9b2da106ce..35259d8ec7 100644
--- a/docs/system/devices/vhost-user.rst
+++ b/docs/system/devices/vhost-user.rst
@@ -98,8 +98,9 @@ Shared memory object
 
 In order for the daemon to access the VirtIO queues to process the
 requests it needs access to the guest's address space. This is
-achieved via the ``memory-backend-file`` or ``memory-backend-memfd``
-objects. A reference to a file-descriptor which can access this object
+achieved via the ``memory-backend-file``, ``memory-backend-memfd``, or
+``memory-backend-shm`` objects.
+A reference to a file-descriptor which can access this object
 will be passed via the socket as part of the protocol negotiation.
 
 Currently the shared memory object needs to match the size of the main
diff --git a/qapi/qom.json b/qapi/qom.json
index 38dde6d785..52df052df8 100644
--- a/qapi/qom.json
+++ b/qapi/qom.json
@@ -721,6 +721,19 @@
 '*hugetlbsize': 'size',
 '*seal': 'bool' } }
 
+##
+# @MemoryBackendShmProperties:
+#
+# Properties for memory-backend-shm objects.
+#
+# The @share boolean option is true by default with shm.
+#
+# Since: 9.1
+##
+{ 'struct': 'MemoryBackendShmProperties',
+  'base': 'MemoryBackendProperties',
+  'data': { } }
+
 ##
 # @MemoryBackendEpcProperties:
 #
@@ -985,6 +998,8 @@
 { 'name': 'memory-backend-memfd',
   'if': 'CONFIG_LINUX' },
 'memory-backend-ram',
+{ 'name': 'memory-backend-shm',
+  'if': 'CONFIG_POSIX' },
 'pef-guest',
 { 'name': 'pr-manager-helper',
   'if': 'CONFIG_LINUX' },
@@ -1056,6 +1071,8 @@
   'memory-backend-memfd':   { 'type': 'MemoryBackendMemfdProperties',
   'if': 'CONFIG_LINUX' },
   'memory-backend-ram': 'MemoryBackendProperties',
+  'memory-backend-shm': { 'type': 'MemoryBackendShmProperties',
+  'if': 'CONFIG_POSIX' },
   'pr-manager-helper':  { 'type': 'PrManagerHelperProperties',
   'if': 'CONFIG_LINUX' },
   'qtest':  'QtestProperties',
diff --git a/backends/hostmem-shm.c b/backends/hostmem-shm.c
new file mode 100644
index 00..374edc3db8
--- /dev/null
+++ b/backends/hostmem-shm.c
@@ -0,0 +1,123 @@
+/*
+ * QEMU host POSIX shared memory object backend
+ *
+ * Copyright (C) 2024 Red Hat Inc
+ *
+ * Authors:
+ *   Stefano Garzarella 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "sysemu/hostmem.h"
+#include "qapi/error.h"
+
+#define TYPE_MEMORY_BACKEND_SHM "memory-backend-shm"
+
+OBJECT_DECLARE_SIMPLE_TYPE(HostMemoryBackendShm, MEMORY_BACKEND_SHM)
+
+struct HostMemoryBackendShm {
+HostMemoryBackend parent_obj;
+};
+
+static bool
+shm_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
+{
+g_autoptr(GString) shm_name = g_string_new(NULL);
+g_autofree char *backend_name = NULL;
+uint32_t ram_flags;
+int fd, oflag;
+mode_t mode;
+
+if (!backend->size) {
+error_setg(errp, "can't create shm backend with size 0");
+return false;
+}
+
+if (!backend->share) {
+error_setg(errp, "can't create shm backend with `share=off`");
+return fals

[PATCH v4 11/12] tests/qtest/vhost-user-blk-test: use memory-backend-shm

2024-05-08 Thread Stefano Garzarella
`memory-backend-memfd` is available only on Linux while the new
`memory-backend-shm` can be used on any POSIX-compliant operating
system. Let's use it so we can run the test in multiple environments.

Signed-off-by: Stefano Garzarella 
---
 tests/qtest/vhost-user-blk-test.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/qtest/vhost-user-blk-test.c b/tests/qtest/vhost-user-blk-test.c
index 117b9acd10..e945f6abf2 100644
--- a/tests/qtest/vhost-user-blk-test.c
+++ b/tests/qtest/vhost-user-blk-test.c
@@ -906,7 +906,7 @@ static void start_vhost_user_blk(GString *cmd_line, int vus_instances,
vhost_user_blk_bin);
 
 g_string_append_printf(cmd_line,
-" -object memory-backend-memfd,id=mem,size=256M,share=on "
+" -object memory-backend-shm,id=mem,size=256M,share=on "
 " -M memory-backend=mem -m 256M ");
 
 for (i = 0; i < vus_instances; i++) {
-- 
2.45.0




[PATCH v4 05/12] contrib/vhost-user-blk: fix bind() using the right size of the address

2024-05-08 Thread Stefano Garzarella
On macOS, passing the `-s /tmp/vhost.socket` parameter to the
vhost-user-blk application caused the bind to be done on the
`/tmp/vhost.socke` pathname, missing the last character.

This sounds like one of the portability problems described in the
unix(7) manpage:

   Pathname sockets
   When  binding  a socket to a pathname, a few rules should
   be observed for maximum portability and ease of coding:

   •  The pathname in sun_path should be null-terminated.

   •  The length of the pathname, including the  terminating
  null byte, should not exceed the size of sun_path.

   •  The  addrlen  argument  that  describes  the enclosing
  sockaddr_un structure should have a value of at least:

  offsetof(struct sockaddr_un, sun_path) +
  strlen(addr.sun_path)+1

  or,  more  simply,  addrlen  can   be   specified   as
  sizeof(struct sockaddr_un).

So let's follow the last advice and simplify the code as well.
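
A minimal sketch of the advice above, assuming a hypothetical helper
`unix_bind_example` (not the unix_sock_new() function from the patch).
Presumably the old computation, sizeof(un.sun_family) + strlen(un.sun_path),
falls short on systems where `struct sockaddr_un` carries an extra leading
length field, which would explain the missing last character.

```c
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/*
 * Hypothetical helper: create an AF_UNIX stream socket bound to `path`.
 * Returns the socket fd, or -1 on error.
 */
static int unix_bind_example(const char *path)
{
    struct sockaddr_un un;
    int sock;

    /* Reject paths that would be truncated in sun_path. */
    if (strlen(path) >= sizeof(un.sun_path)) {
        return -1;
    }

    sock = socket(AF_UNIX, SOCK_STREAM, 0);
    if (sock < 0) {
        return -1;
    }

    /* Zero the whole struct so sun_path is null-terminated. */
    memset(&un, 0, sizeof(un));
    un.sun_family = AF_UNIX;
    snprintf(un.sun_path, sizeof(un.sun_path), "%s", path);

    /* Remove a stale socket file, if any. */
    (void)unlink(path);

    /* Pass the size of the enclosing structure, as unix(7) suggests. */
    if (bind(sock, (struct sockaddr *)&un, sizeof(un)) < 0) {
        close(sock);
        return -1;
    }
    return sock;
}
```

Using sizeof(struct sockaddr_un) avoids any per-platform arithmetic on the
layout of the structure.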

Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Stefano Garzarella 
---
 contrib/vhost-user-blk/vhost-user-blk.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/contrib/vhost-user-blk/vhost-user-blk.c b/contrib/vhost-user-blk/vhost-user-blk.c
index 89e5f11a64..a8ab9269a2 100644
--- a/contrib/vhost-user-blk/vhost-user-blk.c
+++ b/contrib/vhost-user-blk/vhost-user-blk.c
@@ -469,7 +469,6 @@ static int unix_sock_new(char *unix_fn)
 {
 int sock;
 struct sockaddr_un un;
-size_t len;
 
 assert(unix_fn);
 
@@ -481,10 +480,9 @@ static int unix_sock_new(char *unix_fn)
 
 un.sun_family = AF_UNIX;
 (void)snprintf(un.sun_path, sizeof(un.sun_path), "%s", unix_fn);
-len = sizeof(un.sun_family) + strlen(un.sun_path);
 
 (void)unlink(unix_fn);
-if (bind(sock, (struct sockaddr *)&un, len) < 0) {
+if (bind(sock, (struct sockaddr *)&un, sizeof(un)) < 0) {
 perror("bind");
 goto fail;
 }
-- 
2.45.0




[PATCH v4 12/12] tests/qtest/vhost-user-test: add a test case for memory-backend-shm

2024-05-08 Thread Stefano Garzarella
`memory-backend-shm` can be used with vhost-user devices, so let's
add a new test case for it.

Signed-off-by: Stefano Garzarella 
---
 tests/qtest/vhost-user-test.c | 23 +++
 1 file changed, 23 insertions(+)

diff --git a/tests/qtest/vhost-user-test.c b/tests/qtest/vhost-user-test.c
index d4e437265f..8c1d903b2a 100644
--- a/tests/qtest/vhost-user-test.c
+++ b/tests/qtest/vhost-user-test.c
@@ -44,6 +44,8 @@
 "mem-path=%s,share=on -numa node,memdev=mem"
 #define QEMU_CMD_MEMFD  " -m %d -object memory-backend-memfd,id=mem,size=%dM," \
 " -numa node,memdev=mem"
+#define QEMU_CMD_SHM" -m %d -object memory-backend-shm,id=mem,size=%dM," \
+" -numa node,memdev=mem"
 #define QEMU_CMD_CHR" -chardev socket,id=%s,path=%s%s"
 #define QEMU_CMD_NETDEV " -netdev vhost-user,id=hs0,chardev=%s,vhostforce=on"
 
@@ -195,6 +197,7 @@ enum test_memfd {
 TEST_MEMFD_AUTO,
 TEST_MEMFD_YES,
 TEST_MEMFD_NO,
+TEST_MEMFD_SHM,
 };
 
 static void append_vhost_net_opts(TestServer *s, GString *cmd_line,
@@ -228,6 +231,8 @@ static void append_mem_opts(TestServer *server, GString *cmd_line,
 
 if (memfd == TEST_MEMFD_YES) {
 g_string_append_printf(cmd_line, QEMU_CMD_MEMFD, size, size);
+} else if (memfd == TEST_MEMFD_SHM) {
+g_string_append_printf(cmd_line, QEMU_CMD_SHM, size, size);
 } else {
 const char *root = init_hugepagefs() ? : server->tmpfs;
 
@@ -788,6 +793,19 @@ static void *vhost_user_test_setup_memfd(GString *cmd_line, void *arg)
 return server;
 }
 
+static void *vhost_user_test_setup_shm(GString *cmd_line, void *arg)
+{
+TestServer *server = test_server_new("vhost-user-test", arg);
+test_server_listen(server);
+
+append_mem_opts(server, cmd_line, 256, TEST_MEMFD_SHM);
+server->vu_ops->append_opts(server, cmd_line, "");
+
+g_test_queue_destroy(vhost_user_test_cleanup, server);
+
+return server;
+}
+
 static void test_read_guest_mem(void *obj, void *arg, QGuestAllocator *alloc)
 {
 TestServer *server = arg;
@@ -1081,6 +1099,11 @@ static void register_vhost_user_test(void)
  "virtio-net",
  test_read_guest_mem, &opts);
 
+opts.before = vhost_user_test_setup_shm;
+qos_add_test("vhost-user/read-guest-mem/shm",
+ "virtio-net",
+ test_read_guest_mem, &opts);
+
 if (qemu_memfd_check(MFD_ALLOW_SEALING)) {
 opts.before = vhost_user_test_setup_memfd;
 qos_add_test("vhost-user/read-guest-mem/memfd",
-- 
2.45.0




[PATCH v4 08/12] libvhost-user: enable it on any POSIX system

2024-05-08 Thread Stefano Garzarella
The vhost-user protocol is not really Linux-specific so let's enable
libvhost-user for any POSIX system.

Compiling it on macOS and FreeBSD, some problems came up:
- avoid including linux/vhost.h, which is available only on Linux
  (vhost_types.h contains many of the things we need)
- macOS doesn't provide sys/endian.h, so let's define the byte-swapping
  macros (note: libvhost-user doesn't include qemu's headers, so we
  can't use "qemu/bswap.h")
- define eventfd_[write|read] as write/read wrappers when the system
  doesn't provide them (e.g. macOS)
- copy SEAL defines from include/qemu/memfd.h to make the code work
  on FreeBSD where MFD_ALLOW_SEALING is defined
- define MAP_NORESERVE if it's not defined (e.g. on FreeBSD)

Signed-off-by: Stefano Garzarella 
---
 meson.build   |  2 +-
 subprojects/libvhost-user/libvhost-user.h |  2 +-
 subprojects/libvhost-user/libvhost-user.c | 60 +--
 3 files changed, 59 insertions(+), 5 deletions(-)

diff --git a/meson.build b/meson.build
index 955921dcb8..7954da5971 100644
--- a/meson.build
+++ b/meson.build
@@ -3168,7 +3168,7 @@ endif
 config_host_data.set('CONFIG_FDT', fdt.found())
 
 vhost_user = not_found
-if host_os == 'linux' and have_vhost_user
+if have_vhost_user
   libvhost_user = subproject('libvhost-user')
   vhost_user = libvhost_user.get_variable('vhost_user_dep')
 endif
diff --git a/subprojects/libvhost-user/libvhost-user.h b/subprojects/libvhost-user/libvhost-user.h
index deb40e77b3..e13e1d3931 100644
--- a/subprojects/libvhost-user/libvhost-user.h
+++ b/subprojects/libvhost-user/libvhost-user.h
@@ -18,9 +18,9 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include "standard-headers/linux/virtio_ring.h"
+#include "standard-headers/linux/vhost_types.h"
 
 /* Based on qemu/hw/virtio/vhost-user.c */
 #define VHOST_USER_F_PROTOCOL_FEATURES 30
diff --git a/subprojects/libvhost-user/libvhost-user.c b/subprojects/libvhost-user/libvhost-user.c
index 1c361ffd51..03edb4bf64 100644
--- a/subprojects/libvhost-user/libvhost-user.c
+++ b/subprojects/libvhost-user/libvhost-user.c
@@ -28,9 +28,7 @@
 #include 
 #include 
 #include 
-#include 
 #include 
-#include 
 
 /* Necessary to provide VIRTIO_F_VERSION_1 on system
  * with older linux headers. Must appear before
@@ -39,8 +37,8 @@
 #include "standard-headers/linux/virtio_config.h"
 
 #if defined(__linux__)
+#include 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -52,6 +50,62 @@
 
 #endif
 
+#if defined(__APPLE__) && (__MACH__)
+#include 
+#define htobe16(x) OSSwapHostToBigInt16(x)
+#define htole16(x) OSSwapHostToLittleInt16(x)
+#define be16toh(x) OSSwapBigToHostInt16(x)
+#define le16toh(x) OSSwapLittleToHostInt16(x)
+
+#define htobe32(x) OSSwapHostToBigInt32(x)
+#define htole32(x) OSSwapHostToLittleInt32(x)
+#define be32toh(x) OSSwapBigToHostInt32(x)
+#define le32toh(x) OSSwapLittleToHostInt32(x)
+
+#define htobe64(x) OSSwapHostToBigInt64(x)
+#define htole64(x) OSSwapHostToLittleInt64(x)
+#define be64toh(x) OSSwapBigToHostInt64(x)
+#define le64toh(x) OSSwapLittleToHostInt64(x)
+#endif
+
+#ifdef CONFIG_EVENTFD
+#include 
+#else
+#define eventfd_t uint64_t
+
+int eventfd_write(int fd, eventfd_t value)
+{
+return (write(fd, &value, sizeof(value)) == sizeof(value)) ? 0 : -1;
+}
+
+int eventfd_read(int fd, eventfd_t *value)
+{
+return (read(fd, value, sizeof(*value)) == sizeof(*value)) ? 0 : -1;
+}
+#endif
+
+#ifdef MFD_ALLOW_SEALING
+#include 
+
+#ifndef F_LINUX_SPECIFIC_BASE
+#define F_LINUX_SPECIFIC_BASE 1024
+#endif
+
+#ifndef F_ADD_SEALS
+#define F_ADD_SEALS (F_LINUX_SPECIFIC_BASE + 9)
+#define F_GET_SEALS (F_LINUX_SPECIFIC_BASE + 10)
+
+#define F_SEAL_SEAL 0x0001  /* prevent further seals from being set */
+#define F_SEAL_SHRINK   0x0002  /* prevent file from shrinking */
+#define F_SEAL_GROW 0x0004  /* prevent file from growing */
+#define F_SEAL_WRITE0x0008  /* prevent writes */
+#endif
+#endif
+
+#ifndef MAP_NORESERVE
+#define MAP_NORESERVE 0
+#endif
+
 #include "include/atomic.h"
 
 #include "libvhost-user.h"
-- 
2.45.0




[PATCH v4 03/12] libvhost-user: mask F_INFLIGHT_SHMFD if memfd is not supported

2024-05-08 Thread Stefano Garzarella
libvhost-user will panic when receiving VHOST_USER_GET_INFLIGHT_FD
message if MFD_ALLOW_SEALING is not defined, since it's not able
to create a memfd.

VHOST_USER_GET_INFLIGHT_FD is used only if
VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD is negotiated. So, let's mask
that feature if the backend is not able to properly handle these
messages.
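
The masking idea can be sketched in isolation as below. The bit number 12
and the names `EXAMPLE_F_INFLIGHT_SHMFD` and `example_mask_features` are
assumptions used only for illustration; the real value comes from the
vhost-user specification, and the patch itself decides at compile time
(via #ifndef MFD_ALLOW_SEALING) rather than with a runtime flag.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit number, for illustration only; check the vhost-user spec. */
#define EXAMPLE_F_INFLIGHT_SHMFD 12

/*
 * Hypothetical helper: clear the inflight-shmfd protocol feature bit when
 * the backend has no way to create a memfd, leaving other bits untouched.
 */
static uint64_t example_mask_features(uint64_t features, bool have_memfd)
{
    if (!have_memfd) {
        features &= ~(1ULL << EXAMPLE_F_INFLIGHT_SHMFD);
    }
    return features;
}
```

Since the frontend only sends VHOST_USER_GET_INFLIGHT_FD after the feature
has been negotiated, not advertising the bit is enough to avoid the panic.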

Signed-off-by: Stefano Garzarella 
---
 subprojects/libvhost-user/libvhost-user.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/subprojects/libvhost-user/libvhost-user.c b/subprojects/libvhost-user/libvhost-user.c
index a11afd1960..1c361ffd51 100644
--- a/subprojects/libvhost-user/libvhost-user.c
+++ b/subprojects/libvhost-user/libvhost-user.c
@@ -1674,6 +1674,16 @@ vu_get_protocol_features_exec(VuDev *dev, VhostUserMsg *vmsg)
 features |= dev->iface->get_protocol_features(dev);
 }
 
+/*
+ * If MFD_ALLOW_SEALING is not defined, we are not able to handle
+ * VHOST_USER_GET_INFLIGHT_FD messages, since we can't create a memfd.
+ * Those messages are used only if VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD
+ * is negotiated. A device implementation can enable it, so let's mask
+ * it to avoid a runtime panic.
+ */
+#ifndef MFD_ALLOW_SEALING
+features &= ~(1ULL << VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD);
+#endif
 vmsg_set_reply_u64(vmsg, features);
 return true;
 }
-- 
2.45.0




[PATCH v4 07/12] vhost-user: enable frontends on any POSIX system

2024-05-08 Thread Stefano Garzarella
The vhost-user protocol is not really Linux-specific so let's enable
vhost-user frontends for any POSIX system.

In vhost_net.c we use VHOST_FILE_UNBIND, which is defined in a
Linux-specific header, so let's define it for other systems as well.

Signed-off-by: Stefano Garzarella 
---
 meson.build| 1 -
 hw/net/vhost_net.c | 5 +
 hw/block/Kconfig   | 2 +-
 3 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/meson.build b/meson.build
index 43da492372..955921dcb8 100644
--- a/meson.build
+++ b/meson.build
@@ -151,7 +151,6 @@ have_tpm = get_option('tpm') \
 
 # vhost
 have_vhost_user = get_option('vhost_user') \
-  .disable_auto_if(host_os != 'linux') \
   .require(host_os != 'windows',
error_message: 'vhost-user is not available on Windows').allowed()
 have_vhost_vdpa = get_option('vhost_vdpa') \
diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
index fd1a93701a..fced429813 100644
--- a/hw/net/vhost_net.c
+++ b/hw/net/vhost_net.c
@@ -34,8 +34,13 @@
 #include "standard-headers/linux/virtio_ring.h"
 #include "hw/virtio/vhost.h"
 #include "hw/virtio/virtio-bus.h"
+#if defined(__linux__)
 #include "linux-headers/linux/vhost.h"
+#endif
 
+#ifndef VHOST_FILE_UNBIND
+#define VHOST_FILE_UNBIND -1
+#endif
 
 /* Features supported by host kernel. */
 static const int kernel_feature_bits[] = {
diff --git a/hw/block/Kconfig b/hw/block/Kconfig
index 9e8f28f982..29ee09e434 100644
--- a/hw/block/Kconfig
+++ b/hw/block/Kconfig
@@ -40,7 +40,7 @@ config VHOST_USER_BLK
 bool
 # Only PCI devices are provided for now
 default y if VIRTIO_PCI
-depends on VIRTIO && VHOST_USER && LINUX
+depends on VIRTIO && VHOST_USER
 
 config SWIM
 bool
-- 
2.45.0




[PATCH v4 09/12] contrib/vhost-user-blk: enable it on any POSIX system

2024-05-08 Thread Stefano Garzarella
Let's make the code more portable by adding defines from
block/file-posix.c to support O_DIRECT in other systems (e.g. macOS).

vhost-user-server.c is a dependency, let's enable it for any POSIX
system.

Signed-off-by: Stefano Garzarella 
---
v4:
- moved using of "qemu/bswap.h" API in a separate patch [Phil]
---
 meson.build |  2 --
 contrib/vhost-user-blk/vhost-user-blk.c | 14 ++
 util/meson.build|  4 +++-
 3 files changed, 17 insertions(+), 3 deletions(-)

diff --git a/meson.build b/meson.build
index 7954da5971..25047db3c1 100644
--- a/meson.build
+++ b/meson.build
@@ -1960,8 +1960,6 @@ has_statx = cc.has_header_symbol('sys/stat.h', 'STATX_BASIC_STATS', prefix: gnu_
 has_statx_mnt_id = cc.has_header_symbol('sys/stat.h', 'STATX_MNT_ID', prefix: gnu_source_prefix)
 
 have_vhost_user_blk_server = get_option('vhost_user_blk_server') \
-  .require(host_os == 'linux',
-   error_message: 'vhost_user_blk_server requires linux') \
   .require(have_vhost_user,
 error_message: 'vhost_user_blk_server requires vhost-user support') \
   .disable_auto_if(not have_tools and not have_system) \
diff --git a/contrib/vhost-user-blk/vhost-user-blk.c b/contrib/vhost-user-blk/vhost-user-blk.c
index 9492146855..a450337685 100644
--- a/contrib/vhost-user-blk/vhost-user-blk.c
+++ b/contrib/vhost-user-blk/vhost-user-blk.c
@@ -25,6 +25,20 @@
 #include 
 #endif
 
+/* OS X does not have O_DSYNC */
+#ifndef O_DSYNC
+#ifdef O_SYNC
+#define O_DSYNC O_SYNC
+#elif defined(O_FSYNC)
+#define O_DSYNC O_FSYNC
+#endif
+#endif
+
+/* Approximate O_DIRECT with O_DSYNC if O_DIRECT isn't available */
+#ifndef O_DIRECT
+#define O_DIRECT O_DSYNC
+#endif
+
 enum {
 VHOST_USER_BLK_MAX_QUEUES = 8,
 };
diff --git a/util/meson.build b/util/meson.build
index 2ad57b10ba..93054f2340 100644
--- a/util/meson.build
+++ b/util/meson.build
@@ -112,10 +112,12 @@ if have_block
 util_ss.add(files('filemonitor-stub.c'))
   endif
   if host_os == 'linux'
-util_ss.add(files('vhost-user-server.c'), vhost_user)
 util_ss.add(files('vfio-helpers.c'))
 util_ss.add(files('chardev_open.c'))
   endif
+  if host_os != 'windows'
+util_ss.add(files('vhost-user-server.c'), vhost_user)
+  endif
   util_ss.add(files('yank.c'))
 endif
 
-- 
2.45.0




[PATCH v4 06/12] contrib/vhost-user-*: use QEMU bswap helper functions

2024-05-08 Thread Stefano Garzarella
Let's replace the calls to le*toh() and htole*() with qemu/bswap.h
helpers to make the code more portable.
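
To illustrate why such helpers are portable, here is a rough sketch of
endian-independent little-endian accessors. The functions
`example_le32_load` and `example_le32_store` are hypothetical stand-ins
written for this example; they are not QEMU's implementation, and the real
le32_to_cpu()/cpu_to_le32() helpers take and return values rather than
pointers.

```c
#include <stdint.h>

/*
 * Hypothetical helper: read a little-endian 32-bit value byte by byte.
 * This gives the same result on any host endianness and does not need
 * <endian.h> or <sys/endian.h>.
 */
static uint32_t example_le32_load(const void *p)
{
    const uint8_t *b = p;

    return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
           ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

/* Hypothetical helper: write `v` to memory in little-endian byte order. */
static void example_le32_store(void *p, uint32_t v)
{
    uint8_t *b = p;

    b[0] = v & 0xff;
    b[1] = (v >> 8) & 0xff;
    b[2] = (v >> 16) & 0xff;
    b[3] = (v >> 24) & 0xff;
}
```

The le*toh()/htole*() macros, by contrast, live in different headers on
different systems, and are missing on some of them.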

Suggested-by: Philippe Mathieu-Daudé 
Signed-off-by: Stefano Garzarella 
---
 contrib/vhost-user-blk/vhost-user-blk.c |  9 +
 contrib/vhost-user-input/main.c | 16 
 2 files changed, 13 insertions(+), 12 deletions(-)

diff --git a/contrib/vhost-user-blk/vhost-user-blk.c b/contrib/vhost-user-blk/vhost-user-blk.c
index a8ab9269a2..9492146855 100644
--- a/contrib/vhost-user-blk/vhost-user-blk.c
+++ b/contrib/vhost-user-blk/vhost-user-blk.c
@@ -16,6 +16,7 @@
  */
 
 #include "qemu/osdep.h"
+#include "qemu/bswap.h"
 #include "standard-headers/linux/virtio_blk.h"
 #include "libvhost-user-glib.h"
 
@@ -194,8 +195,8 @@ vub_discard_write_zeroes(VubReq *req, struct iovec *iov, uint32_t iovcnt,
 #if defined(__linux__) && defined(BLKDISCARD) && defined(BLKZEROOUT)
 VubDev *vdev_blk = req->vdev_blk;
 desc = buf;
-uint64_t range[2] = { le64toh(desc->sector) << 9,
-  le32toh(desc->num_sectors) << 9 };
+uint64_t range[2] = { le64_to_cpu(desc->sector) << 9,
+  le32_to_cpu(desc->num_sectors) << 9 };
 if (type == VIRTIO_BLK_T_DISCARD) {
 if (ioctl(vdev_blk->blk_fd, BLKDISCARD, range) == 0) {
 g_free(buf);
@@ -267,13 +268,13 @@ static int vub_virtio_process_req(VubDev *vdev_blk,
 req->in = (struct virtio_blk_inhdr *)elem->in_sg[in_num - 1].iov_base;
 in_num--;
 
-type = le32toh(req->out->type);
+type = le32_to_cpu(req->out->type);
 switch (type & ~VIRTIO_BLK_T_BARRIER) {
 case VIRTIO_BLK_T_IN:
 case VIRTIO_BLK_T_OUT: {
 ssize_t ret = 0;
 bool is_write = type & VIRTIO_BLK_T_OUT;
-req->sector_num = le64toh(req->out->sector);
+req->sector_num = le64_to_cpu(req->out->sector);
 if (is_write) {
 ret  = vub_writev(req, &elem->out_sg[1], out_num);
 } else {
diff --git a/contrib/vhost-user-input/main.c b/contrib/vhost-user-input/main.c
index 081230da54..f3362d41ac 100644
--- a/contrib/vhost-user-input/main.c
+++ b/contrib/vhost-user-input/main.c
@@ -51,8 +51,8 @@ static void vi_input_send(VuInput *vi, struct virtio_input_event *event)
 vi->queue[vi->qindex++].event = *event;
 
 /* ... until we see a report sync ... */
-if (event->type != htole16(EV_SYN) ||
-event->code != htole16(SYN_REPORT)) {
+if (event->type != cpu_to_le16(EV_SYN) ||
+event->code != cpu_to_le16(SYN_REPORT)) {
 return;
 }
 
@@ -103,9 +103,9 @@ vi_evdev_watch(VuDev *dev, int condition, void *data)
 
 g_debug("input %d %d %d", evdev.type, evdev.code, evdev.value);
 
-virtio.type  = htole16(evdev.type);
-virtio.code  = htole16(evdev.code);
-virtio.value = htole32(evdev.value);
+virtio.type  = cpu_to_le16(evdev.type);
+virtio.code  = cpu_to_le16(evdev.code);
+virtio.value = cpu_to_le32(evdev.value);
 vi_input_send(vi, &virtio);
 }
 }
@@ -124,9 +124,9 @@ static void vi_handle_status(VuInput *vi, virtio_input_event *event)
 
 evdev.input_event_sec = tval.tv_sec;
 evdev.input_event_usec = tval.tv_usec;
-evdev.type = le16toh(event->type);
-evdev.code = le16toh(event->code);
-evdev.value = le32toh(event->value);
+evdev.type = le16_to_cpu(event->type);
+evdev.code = le16_to_cpu(event->code);
+evdev.value = le32_to_cpu(event->value);
 
 rc = write(vi->evdevfd, &evdev, sizeof(evdev));
 if (rc == -1) {
-- 
2.45.0




[PATCH v4 01/12] libvhost-user: set msg.msg_control to NULL when it is empty

2024-05-08 Thread Stefano Garzarella
On some OS (e.g. macOS) sendmsg() returns -1 (errno EINVAL) if
the `struct msghdr` has the field `msg_controllen` set to 0, but
`msg_control` is not NULL.
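
A minimal sketch of a sendmsg() call that carries no ancillary data, with
the two control fields kept consistent. The helper name `send_plain` is an
assumption made up for this example; it is not part of libvhost-user.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/*
 * Hypothetical helper: send `len` bytes over `fd` with sendmsg() and no
 * ancillary data. Returns the number of bytes sent, or -1 on error.
 */
static ssize_t send_plain(int fd, const void *buf, size_t len)
{
    struct iovec iov = {
        .iov_base = (void *)buf,
        .iov_len = len,
    };
    struct msghdr msg;

    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;

    /* No control data: keep the pointer and the length consistent. */
    msg.msg_control = NULL;
    msg.msg_controllen = 0;

    return sendmsg(fd, &msg, 0);
}
```

When file descriptors do need to be passed, `msg_control` would instead
point to a buffer sized with CMSG_SPACE() and `msg_controllen` would be set
to match.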

Reviewed-by: Eric Blake 
Reviewed-by: David Hildenbrand 
Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Stefano Garzarella 
---
 subprojects/libvhost-user/libvhost-user.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/subprojects/libvhost-user/libvhost-user.c b/subprojects/libvhost-user/libvhost-user.c
index a879149fef..22bea0c775 100644
--- a/subprojects/libvhost-user/libvhost-user.c
+++ b/subprojects/libvhost-user/libvhost-user.c
@@ -632,6 +632,7 @@ vu_message_write(VuDev *dev, int conn_fd, VhostUserMsg *vmsg)
 memcpy(CMSG_DATA(cmsg), vmsg->fds, fdsize);
 } else {
 msg.msg_controllen = 0;
+msg.msg_control = NULL;
 }
 
 do {
-- 
2.45.0




[PATCH v4 00/12] vhost-user: support any POSIX system (tested on macOS, FreeBSD, OpenBSD)

2024-05-08 Thread Stefano Garzarella
v1: https://patchew.org/QEMU/20240228114759.44758-1-sgarz...@redhat.com/
v2: https://patchew.org/QEMU/20240326133936.125332-1-sgarz...@redhat.com/
v3: https://patchew.org/QEMU/20240404122330.92710-1-sgarz...@redhat.com/
v4:
  - rebased on master (commit e116b92d01c2cd75957a9f8ad1d4932292867b81)
  - added patch 6 to move using QEMU bswap helper functions in a separate
patch (Phil)
  - fail if we find "share=off" in shm_backend_memory_alloc() (David)
  - added Phil's R-b and David's A-b

The vhost-user protocol is not really Linux-specific, so let's try to support
QEMU's frontends and backends (including libvhost-user) in any POSIX system
with this series. The main use case is to be able to use virtio devices that
we don't have built-in in QEMU (e.g. virtiofsd, vhost-user-vsock, etc.) even
in non-Linux systems.

The first 5 patches are more like fixes discovered at runtime on macOS or
FreeBSD that could even be merged independently of this series.

Patches 6, 7, 8, and 9 enable building of frontends and backends (including
libvhost-user) with associated code changes to succeed in compilation.

Patch 10 adds `memory-backend-shm` that uses the POSIX shm_open() API to
create shared memory which is identified by an fd that can be shared with
vhost-user backends. This is useful on those systems (like macOS) where
we don't have memfd_create() or special filesystems like "/dev/shm".

Patches 11 and 12 use `memory-backend-shm` in some vhost-user tests.

Maybe the first 5 patches can go separately, but I only discovered those
problems after testing patches 6 - 9, so I have included them in this series
for now. Please let me know if you prefer that I send them separately.

I tested this series using vhost-user-blk and QSD on macOS Sonoma 14.4
(aarch64), FreeBSD 14 (x86_64), OpenBSD 7.4 (x86_64), and Fedora 39 (x86_64)
in this way:

- Start vhost-user-blk or QSD (same commands for all systems)

  vhost-user-blk -s /tmp/vhost.socket \
-b Fedora-Cloud-Base-39-1.5.x86_64.raw

  qemu-storage-daemon \
--blockdev file,filename=Fedora-Cloud-Base-39-1.5.x86_64.qcow2,node-name=file \
--blockdev qcow2,file=file,node-name=qcow2 \
--export vhost-user-blk,addr.type=unix,addr.path=/tmp/vhost.socket,id=vub,num-queues=1,node-name=qcow2,writable=on

- macOS (aarch64): start QEMU (using hvf accelerator)

  qemu-system-aarch64 -smp 2 -cpu host -M virt,accel=hvf,memory-backend=mem \
-drive file=./build/pc-bios/edk2-aarch64-code.fd,if=pflash,format=raw,readonly=on \
-device virtio-net-device,netdev=net0 -netdev user,id=net0 \
-device ramfb -device usb-ehci -device usb-kbd \
-object memory-backend-shm,id=mem,size=512M \
-device vhost-user-blk-pci,num-queues=1,disable-legacy=on,chardev=char0 \
-chardev socket,id=char0,path=/tmp/vhost.socket

- FreeBSD/OpenBSD (x86_64): start QEMU (no accelerators available)

  qemu-system-x86_64 -smp 2 -M q35,memory-backend=mem \
-object memory-backend-shm,id=mem,size="512M" \
-device vhost-user-blk-pci,num-queues=1,chardev=char0 \
-chardev socket,id=char0,path=/tmp/vhost.socket

- Fedora (x86_64): start QEMU (using kvm accelerator)

  qemu-system-x86_64 -smp 2 -M q35,accel=kvm,memory-backend=mem \
-object memory-backend-shm,id=mem,size="512M" \
-device vhost-user-blk-pci,num-queues=1,chardev=char0 \
-chardev socket,id=char0,path=/tmp/vhost.socket

Branch pushed (and CI started) at 
https://gitlab.com/sgarzarella/qemu/-/tree/macos-vhost-user?ref_type=heads

Thanks,
Stefano

Stefano Garzarella (12):
  libvhost-user: set msg.msg_control to NULL when it is empty
  libvhost-user: fail vu_message_write() if sendmsg() is failing
  libvhost-user: mask F_INFLIGHT_SHMFD if memfd is not supported
  vhost-user-server: do not set memory fd non-blocking
  contrib/vhost-user-blk: fix bind() using the right size of the address
  contrib/vhost-user-*: use QEMU bswap helper functions
  vhost-user: enable frontends on any POSIX system
  libvhost-user: enable it on any POSIX system
  contrib/vhost-user-blk: enable it on any POSIX system
  hostmem: add a new memory backend based on POSIX shm_open()
  tests/qtest/vhost-user-blk-test: use memory-backend-shm
  tests/qtest/vhost-user-test: add a test case for memory-backend-shm

 docs/system/devices/vhost-user.rst|   5 +-
 meson.build   |   5 +-
 qapi/qom.json |  17 +++
 subprojects/libvhost-user/libvhost-user.h |   2 +-
 backends/hostmem-shm.c| 123 ++
 contrib/vhost-user-blk/vhost-user-blk.c   |  27 +++--
 contrib/vhost-user-input/main.c   |  16 +--
 hw/net/vhost_net.c|   5 +
 subprojects/libvhost-user/libvhost-user.c |  76 -
 tests/qtest/vhost-user-blk-test.c |   2 +-
 tests/qtest/vhost-user-test.c |  23 
 util/vhost-user-server.c  |  12 +++
 backends/meson.build  |   1 +
 hw/bloc

Re: [PATCH for-9.1 v3 09/11] hostmem: add a new memory backend based on POSIX shm_open()

2024-04-08 Thread Stefano Garzarella

On Mon, Apr 08, 2024 at 10:03:15AM +0200, David Hildenbrand wrote:

On 08.04.24 09:58, Stefano Garzarella wrote:

On Thu, Apr 04, 2024 at 04:09:34PM +0200, David Hildenbrand wrote:

On 04.04.24 14:23, Stefano Garzarella wrote:

shm_open() creates and opens a new POSIX shared memory object.
A POSIX shared memory object allows creating memory backend with an
associated file descriptor that can be shared with external processes
(e.g. vhost-user).

The new `memory-backend-shm` can be used as an alternative when
`memory-backend-memfd` is not available (Linux only), since shm_open()
should be provided by any POSIX-compliant operating system.

This backend mimics memfd, allocating memory that is practically
anonymous. In theory shm_open() requires a name, but this is allocated
for a short time interval and shm_unlink() is called right after
shm_open(). After that, only fd is shared with external processes
(e.g., vhost-user) as if it were associated with anonymous memory.

In the future we may also allow the user to specify the name to be
passed to shm_open(), but for now we keep the backend simple, mimicking
anonymous memory such as memfd.

Signed-off-by: Stefano Garzarella 
---
v3
- enriched commit message and documentation to highlight that we
  want to mimic memfd (David)
---
 docs/system/devices/vhost-user.rst |   5 +-
 qapi/qom.json  |  17 +
 backends/hostmem-shm.c | 118 +
 backends/meson.build   |   1 +
 qemu-options.hx|  11 +++
 5 files changed, 150 insertions(+), 2 deletions(-)
 create mode 100644 backends/hostmem-shm.c

diff --git a/docs/system/devices/vhost-user.rst b/docs/system/devices/vhost-user.rst
index 9b2da106ce..35259d8ec7 100644
--- a/docs/system/devices/vhost-user.rst
+++ b/docs/system/devices/vhost-user.rst
@@ -98,8 +98,9 @@ Shared memory object
 In order for the daemon to access the VirtIO queues to process the
 requests it needs access to the guest's address space. This is
-achieved via the ``memory-backend-file`` or ``memory-backend-memfd``
-objects. A reference to a file-descriptor which can access this object
+achieved via the ``memory-backend-file``, ``memory-backend-memfd``, or
+``memory-backend-shm`` objects.
+A reference to a file-descriptor which can access this object
 will be passed via the socket as part of the protocol negotiation.
 Currently the shared memory object needs to match the size of the main
diff --git a/qapi/qom.json b/qapi/qom.json
index 85e6b4f84a..5252ec69e3 100644
--- a/qapi/qom.json
+++ b/qapi/qom.json
@@ -721,6 +721,19 @@
 '*hugetlbsize': 'size',
 '*seal': 'bool' } }
+##
+# @MemoryBackendShmProperties:
+#
+# Properties for memory-backend-shm objects.
+#
+# The @share boolean option is true by default with shm.
+#
+# Since: 9.1
+##
+{ 'struct': 'MemoryBackendShmProperties',
+  'base': 'MemoryBackendProperties',
+  'data': { } }
+


Acked-by: David Hildenbrand 

One comment: we should maybe just forbid setting share=off. It doesn't
make any sense and it can even result in an unexpected double memory
consumption. We missed doing that for memfd, unfortunately.


Good point!

IIUC the `share` property is defined by the parent `hostmem`, so I
should find a way to override the property here and disable the setter,
or add an option to `hostmem` to make the property non-writable.


Right, or simply fail later when you would find "share=off" in 
shm_backend_memory_alloc().


This seems like the simplest and cleanest approach, I'll go in this 
direction!




If we ever support named shm_open(), it could make sense for VM
snapshotting. Right now it doesn't really make any sense.


Yeah, I see.

Thanks,
Stefano




Re: [PATCH for-9.1 v3 00/11] vhost-user: support any POSIX system (tested on macOS, FreeBSD, OpenBSD)

2024-04-08 Thread Stefano Garzarella

FYI I'll be on PTO till May 2nd, I'll send the v4 when I'm back ASAP.

Thanks,
Stefano

On Thu, Apr 04, 2024 at 02:23:19PM +0200, Stefano Garzarella wrote:

v1: https://patchew.org/QEMU/20240228114759.44758-1-sgarz...@redhat.com/
v2: https://patchew.org/QEMU/20240326133936.125332-1-sgarz...@redhat.com/
v3:
 - rebased on v9.0.0-rc2
 - patch 4: avoiding setting fd non-blocking for messages where we
   have memory fd (Eric)
 - patch 9: enriched commit message and documentation to highlight that we
   want to mimic memfd (David)

The vhost-user protocol is not really Linux-specific, so let's try to support
QEMU's frontends and backends (including libvhost-user) in any POSIX system
with this series. The main use case is to be able to use virtio devices that
we don't have built-in in QEMU (e.g. virtiofsd, vhost-user-vsock, etc.) even
in non-Linux systems.

The first 5 patches are more like fixes discovered at runtime on macOS or
FreeBSD that could even be merged independently of this series.

Patches 6, 7, and 8 enable building of frontends and backends (including
libvhost-user) with associated code changes to succeed in compilation.

Patch 9 adds `memory-backend-shm` that uses the POSIX shm_open() API to
create shared memory which is identified by an fd that can be shared with
vhost-user backends. This is useful on those systems (like macOS) where
we don't have memfd_create() or special filesystems like "/dev/shm".

Patches 10 and 11 use `memory-backend-shm` in some vhost-user tests.

Maybe the first 5 patches can go separately, but I only discovered those
problems after testing patches 6 - 9, so I have included them in this series
for now. Please let me know if you prefer that I send them separately.

I tested this series using vhost-user-blk and QSD on macOS Sonoma 14.4
(aarch64), FreeBSD 14 (x86_64), OpenBSD 7.4 (x86_64), and Fedora 39 (x86_64)
in this way:

- Start vhost-user-blk or QSD (same commands for all systems)

 vhost-user-blk -s /tmp/vhost.socket \
   -b Fedora-Cloud-Base-39-1.5.x86_64.raw

 qemu-storage-daemon \
   --blockdev file,filename=Fedora-Cloud-Base-39-1.5.x86_64.qcow2,node-name=file \
   --blockdev qcow2,file=file,node-name=qcow2 \
   --export vhost-user-blk,addr.type=unix,addr.path=/tmp/vhost.socket,id=vub,num-queues=1,node-name=qcow2,writable=on

- macOS (aarch64): start QEMU (using hvf accelerator)

 qemu-system-aarch64 -smp 2 -cpu host -M virt,accel=hvf,memory-backend=mem \
   -drive file=./build/pc-bios/edk2-aarch64-code.fd,if=pflash,format=raw,readonly=on \
   -device virtio-net-device,netdev=net0 -netdev user,id=net0 \
   -device ramfb -device usb-ehci -device usb-kbd \
   -object memory-backend-shm,id=mem,size=512M \
   -device vhost-user-blk-pci,num-queues=1,disable-legacy=on,chardev=char0 \
   -chardev socket,id=char0,path=/tmp/vhost.socket

- FreeBSD/OpenBSD (x86_64): start QEMU (no accelerators available)

 qemu-system-x86_64 -smp 2 -M q35,memory-backend=mem \
   -object memory-backend-shm,id=mem,size="512M" \
   -device vhost-user-blk-pci,num-queues=1,chardev=char0 \
   -chardev socket,id=char0,path=/tmp/vhost.socket

- Fedora (x86_64): start QEMU (using kvm accelerator)

 qemu-system-x86_64 -smp 2 -M q35,accel=kvm,memory-backend=mem \
   -object memory-backend-shm,id=mem,size="512M" \
   -device vhost-user-blk-pci,num-queues=1,chardev=char0 \
   -chardev socket,id=char0,path=/tmp/vhost.socket

Branch pushed (and CI started) at 
https://gitlab.com/sgarzarella/qemu/-/tree/macos-vhost-user?ref_type=heads

Thanks,
Stefano

Stefano Garzarella (11):
 libvhost-user: set msg.msg_control to NULL when it is empty
 libvhost-user: fail vu_message_write() if sendmsg() is failing
 libvhost-user: mask F_INFLIGHT_SHMFD if memfd is not supported
 vhost-user-server: do not set memory fd non-blocking
 contrib/vhost-user-blk: fix bind() using the right size of the address
 vhost-user: enable frontends on any POSIX system
 libvhost-user: enable it on any POSIX system
 contrib/vhost-user-blk: enable it on any POSIX system
 hostmem: add a new memory backend based on POSIX shm_open()
 tests/qtest/vhost-user-blk-test: use memory-backend-shm
 tests/qtest/vhost-user-test: add a test case for memory-backend-shm

docs/system/devices/vhost-user.rst|   5 +-
meson.build   |   5 +-
qapi/qom.json |  17 
subprojects/libvhost-user/libvhost-user.h |   2 +-
backends/hostmem-shm.c| 118 ++
contrib/vhost-user-blk/vhost-user-blk.c   |  23 -
hw/net/vhost_net.c|   5 +
subprojects/libvhost-user/libvhost-user.c |  76 +-
tests/qtest/vhost-user-blk-test.c |   2 +-
tests/qtest/vhost-user-test.c |  23 +
util/vhost-user-server.c  |  12 +++
backends/meson.build  |   1 +
hw/block/Kconfig  |   2 +-
qemu-options.hx   |  11 ++
util/meson.build 

Re: [PATCH for-9.1 v3 09/11] hostmem: add a new memory backend based on POSIX shm_open()

2024-04-08 Thread Stefano Garzarella

On Thu, Apr 04, 2024 at 04:09:34PM +0200, David Hildenbrand wrote:

On 04.04.24 14:23, Stefano Garzarella wrote:

shm_open() creates and opens a new POSIX shared memory object.
A POSIX shared memory object allows creating a memory backend with an
associated file descriptor that can be shared with external processes
(e.g. vhost-user).

The new `memory-backend-shm` can be used as an alternative when
`memory-backend-memfd` is not available (Linux only), since shm_open()
should be provided by any POSIX-compliant operating system.

This backend mimics memfd, allocating memory that is practically
anonymous. In theory shm_open() requires a name, but this is allocated
for a short time interval and shm_unlink() is called right after
shm_open(). After that, only the fd is shared with external processes
(e.g., vhost-user) as if it were associated with anonymous memory.

In the future we may also allow the user to specify the name to be
passed to shm_open(), but for now we keep the backend simple, mimicking
anonymous memory such as memfd.

Signed-off-by: Stefano Garzarella 
---
v3
- enriched commit message and documentation to highlight that we
  want to mimic memfd (David)
---
 docs/system/devices/vhost-user.rst |   5 +-
 qapi/qom.json  |  17 +
 backends/hostmem-shm.c | 118 +
 backends/meson.build   |   1 +
 qemu-options.hx|  11 +++
 5 files changed, 150 insertions(+), 2 deletions(-)
 create mode 100644 backends/hostmem-shm.c

diff --git a/docs/system/devices/vhost-user.rst b/docs/system/devices/vhost-user.rst
index 9b2da106ce..35259d8ec7 100644
--- a/docs/system/devices/vhost-user.rst
+++ b/docs/system/devices/vhost-user.rst
@@ -98,8 +98,9 @@ Shared memory object
 In order for the daemon to access the VirtIO queues to process the
 requests it needs access to the guest's address space. This is
-achieved via the ``memory-backend-file`` or ``memory-backend-memfd``
-objects. A reference to a file-descriptor which can access this object
+achieved via the ``memory-backend-file``, ``memory-backend-memfd``, or
+``memory-backend-shm`` objects.
+A reference to a file-descriptor which can access this object
 will be passed via the socket as part of the protocol negotiation.
 Currently the shared memory object needs to match the size of the main
diff --git a/qapi/qom.json b/qapi/qom.json
index 85e6b4f84a..5252ec69e3 100644
--- a/qapi/qom.json
+++ b/qapi/qom.json
@@ -721,6 +721,19 @@
 '*hugetlbsize': 'size',
 '*seal': 'bool' } }
+##
+# @MemoryBackendShmProperties:
+#
+# Properties for memory-backend-shm objects.
+#
+# The @share boolean option is true by default with shm.
+#
+# Since: 9.1
+##
+{ 'struct': 'MemoryBackendShmProperties',
+  'base': 'MemoryBackendProperties',
+  'data': { } }
+


Acked-by: David Hildenbrand 

One comment: we should maybe just forbid setting share=off. It doesn't 
make any sense and it can even result in unexpected double memory 
consumption. We missed doing that for memfd, unfortunately.


Good point!

IIUC the `share` property is defined by the parent `hostmem`, so I 
should find a way to override the property here and disable the setter, 
or add an option to `hostmem` to make the property non-writable.


Thanks,
Stefano




Re: [PATCH for-9.1 v3 08/11] contrib/vhost-user-blk: enable it on any POSIX system

2024-04-08 Thread Stefano Garzarella

On Thu, Apr 04, 2024 at 04:00:38PM +0200, Philippe Mathieu-Daudé wrote:

Hi Stefano,


Hi Phil!



On 4/4/24 14:23, Stefano Garzarella wrote:

Let's make the code more portable by using the "qemu/bswap.h" API
and adding defines from block/file-posix.c to support O_DIRECT in
other systems (e.g. macOS).

vhost-user-server.c is a dependency, let's enable it for any POSIX
system.

Signed-off-by: Stefano Garzarella 
---
 meson.build |  2 --
 contrib/vhost-user-blk/vhost-user-blk.c | 19 +--
 util/meson.build|  4 +++-
 3 files changed, 20 insertions(+), 5 deletions(-)




diff --git a/contrib/vhost-user-blk/vhost-user-blk.c b/contrib/vhost-user-blk/vhost-user-blk.c
index a8ab9269a2..462e584857 100644
--- a/contrib/vhost-user-blk/vhost-user-blk.c
+++ b/contrib/vhost-user-blk/vhost-user-blk.c
@@ -16,6 +16,7 @@
  */
 #include "qemu/osdep.h"
+#include "qemu/bswap.h"
 #include "standard-headers/linux/virtio_blk.h"
 #include "libvhost-user-glib.h"




@@ -267,13 +282,13 @@ static int vub_virtio_process_req(VubDev *vdev_blk,
 req->in = (struct virtio_blk_inhdr *)elem->in_sg[in_num - 1].iov_base;
 in_num--;
-type = le32toh(req->out->type);
+type = le32_to_cpu(req->out->type);
 switch (type & ~VIRTIO_BLK_T_BARRIER) {
 case VIRTIO_BLK_T_IN:
 case VIRTIO_BLK_T_OUT: {
 ssize_t ret = 0;
 bool is_write = type & VIRTIO_BLK_T_OUT;
-req->sector_num = le64toh(req->out->sector);
+req->sector_num = le64_to_cpu(req->out->sector);
 if (is_write) {
 ret  = vub_writev(req, &elem->out_sg[1], out_num);
 } else {

Can we switch to the bswap API in a preliminary patch,


Sure, I tried to minimize the patches because it's already big,
but I can split this.


converting all the source files?



What do you mean by "all the source files"?

"le64toh" is used here and in some subprojects (e.g. libvduse,
libvhost-user), where IIUC we can't use QEMU's bswap.h because we
don't want to add a dependency on the QEMU code.

BTW I'll check for other *toh() usage in QEMU code and change in the
preliminary patch you suggested to add.

Thanks for the review,
Stefano




[PATCH for-9.1 v3 08/11] contrib/vhost-user-blk: enable it on any POSIX system

2024-04-04 Thread Stefano Garzarella
Let's make the code more portable by using the "qemu/bswap.h" API
and adding defines from block/file-posix.c to support O_DIRECT in
other systems (e.g. macOS).

vhost-user-server.c is a dependency, let's enable it for any POSIX
system.
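
For readers unfamiliar with what a little-endian-to-host conversion has to
do, here is a minimal sketch. The helper names are hypothetical and this is
not how "qemu/bswap.h" is implemented; it only shows a byte-order-independent
way to decode the little-endian fields of a virtio-blk request header.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: decode a 32-bit little-endian value from memory.
 * It assembles the value byte by byte, so the result is the same on
 * little-endian and big-endian hosts.
 */
static uint32_t example_load_le32(const void *p)
{
    const uint8_t *b = p;

    assert(p);
    return (uint32_t)b[0] |
           ((uint32_t)b[1] << 8) |
           ((uint32_t)b[2] << 16) |
           ((uint32_t)b[3] << 24);
}

/* Hypothetical helper: same idea for a 64-bit little-endian value. */
static uint64_t example_load_le64(const void *p)
{
    const uint8_t *b = p;

    return (uint64_t)example_load_le32(b) |
           ((uint64_t)example_load_le32(b + 4) << 32);
}
```

Helpers of this shape need neither <endian.h> nor <sys/endian.h>, which is
the portability issue this patch works around.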

Signed-off-by: Stefano Garzarella 
---
 meson.build |  2 --
 contrib/vhost-user-blk/vhost-user-blk.c | 19 +--
 util/meson.build|  4 +++-
 3 files changed, 20 insertions(+), 5 deletions(-)

diff --git a/meson.build b/meson.build
index 3197a2f62e..b541e5c875 100644
--- a/meson.build
+++ b/meson.build
@@ -1956,8 +1956,6 @@ has_statx = cc.has_header_symbol('sys/stat.h', 'STATX_BASIC_STATS', prefix: gnu_
 has_statx_mnt_id = cc.has_header_symbol('sys/stat.h', 'STATX_MNT_ID', prefix: gnu_source_prefix)
 
 have_vhost_user_blk_server = get_option('vhost_user_blk_server') \
-  .require(host_os == 'linux',
-   error_message: 'vhost_user_blk_server requires linux') \
   .require(have_vhost_user,
error_message: 'vhost_user_blk_server requires vhost-user support') \
   .disable_auto_if(not have_tools and not have_system) \
diff --git a/contrib/vhost-user-blk/vhost-user-blk.c b/contrib/vhost-user-blk/vhost-user-blk.c
index a8ab9269a2..462e584857 100644
--- a/contrib/vhost-user-blk/vhost-user-blk.c
+++ b/contrib/vhost-user-blk/vhost-user-blk.c
@@ -16,6 +16,7 @@
  */
 
 #include "qemu/osdep.h"
+#include "qemu/bswap.h"
 #include "standard-headers/linux/virtio_blk.h"
 #include "libvhost-user-glib.h"
 
@@ -24,6 +25,20 @@
 #include 
 #endif
 
+/* OS X does not have O_DSYNC */
+#ifndef O_DSYNC
+#ifdef O_SYNC
+#define O_DSYNC O_SYNC
+#elif defined(O_FSYNC)
+#define O_DSYNC O_FSYNC
+#endif
+#endif
+
+/* Approximate O_DIRECT with O_DSYNC if O_DIRECT isn't available */
+#ifndef O_DIRECT
+#define O_DIRECT O_DSYNC
+#endif
+
 enum {
 VHOST_USER_BLK_MAX_QUEUES = 8,
 };
@@ -267,13 +282,13 @@ static int vub_virtio_process_req(VubDev *vdev_blk,
 req->in = (struct virtio_blk_inhdr *)elem->in_sg[in_num - 1].iov_base;
 in_num--;
 
-type = le32toh(req->out->type);
+type = le32_to_cpu(req->out->type);
 switch (type & ~VIRTIO_BLK_T_BARRIER) {
 case VIRTIO_BLK_T_IN:
 case VIRTIO_BLK_T_OUT: {
 ssize_t ret = 0;
 bool is_write = type & VIRTIO_BLK_T_OUT;
-req->sector_num = le64toh(req->out->sector);
+req->sector_num = le64_to_cpu(req->out->sector);
 if (is_write) {
 ret  = vub_writev(req, &elem->out_sg[1], out_num);
 } else {
diff --git a/util/meson.build b/util/meson.build
index 0ef9886be0..f52682ce96 100644
--- a/util/meson.build
+++ b/util/meson.build
@@ -113,10 +113,12 @@ if have_block
 util_ss.add(files('filemonitor-stub.c'))
   endif
   if host_os == 'linux'
-util_ss.add(files('vhost-user-server.c'), vhost_user)
 util_ss.add(files('vfio-helpers.c'))
 util_ss.add(files('chardev_open.c'))
   endif
+  if host_os != 'windows'
+util_ss.add(files('vhost-user-server.c'), vhost_user)
+  endif
 endif
 
 if cpu == 'aarch64'
-- 
2.44.0




[PATCH for-9.1 v3 05/11] contrib/vhost-user-blk: fix bind() using the right size of the address

2024-04-04 Thread Stefano Garzarella
On macOS, when passing the `-s /tmp/vhost.socket` parameter to the
vhost-user-blk application, bind() was performed on the pathname
`/tmp/vhost.socke`, missing the last character.

This sounds like one of the portability problems described in the
unix(7) manpage:

   Pathname sockets
   When  binding  a socket to a pathname, a few rules should
   be observed for maximum portability and ease of coding:

   •  The pathname in sun_path should be null-terminated.

   •  The length of the pathname, including the  terminating
  null byte, should not exceed the size of sun_path.

   •  The  addrlen  argument  that  describes  the enclosing
  sockaddr_un structure should have a value of at least:

  offsetof(struct sockaddr_un, sun_path) +
  strlen(addr.sun_path)+1

  or,  more  simply,  addrlen  can   be   specified   as
  sizeof(struct sockaddr_un).

So let's follow the last advice and simplify the code as well.
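
As an illustration of the two addrlen options quoted above, here is a
minimal sketch. The helper names are hypothetical and this is not the code
from the patch; it only shows the manpage's minimum-length formula next to
the simpler sizeof(struct sockaddr_un) form.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Smallest addrlen that still covers the path and its terminating NUL. */
static socklen_t example_unix_addrlen_min(const struct sockaddr_un *un)
{
    return (socklen_t)(offsetof(struct sockaddr_un, sun_path) +
                       strlen(un->sun_path) + 1);
}

/*
 * Hypothetical helper: create an AF_UNIX stream socket bound to `path`.
 * Returns the socket fd, or -1 on error.
 */
static int example_unix_bind(const char *path)
{
    struct sockaddr_un un;
    int sock;

    assert(path);

    /* The path plus its NUL terminator must fit in sun_path. */
    if (strlen(path) >= sizeof(un.sun_path)) {
        return -1;
    }

    sock = socket(AF_UNIX, SOCK_STREAM, 0);
    if (sock < 0) {
        return -1;
    }

    memset(&un, 0, sizeof(un));
    un.sun_family = AF_UNIX;
    (void)snprintf(un.sun_path, sizeof(un.sun_path), "%s", path);

    (void)unlink(path);
    /* Passing the whole structure size avoids any off-by-one on the path. */
    if (bind(sock, (struct sockaddr *)&un, sizeof(un)) < 0) {
        close(sock);
        return -1;
    }
    return sock;
}
```

Note that sizeof(un.sun_family) + strlen(un.sun_path) is only a safe length
on systems where sun_family is the first and only field before sun_path.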

Signed-off-by: Stefano Garzarella 
---
 contrib/vhost-user-blk/vhost-user-blk.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/contrib/vhost-user-blk/vhost-user-blk.c b/contrib/vhost-user-blk/vhost-user-blk.c
index 89e5f11a64..a8ab9269a2 100644
--- a/contrib/vhost-user-blk/vhost-user-blk.c
+++ b/contrib/vhost-user-blk/vhost-user-blk.c
@@ -469,7 +469,6 @@ static int unix_sock_new(char *unix_fn)
 {
 int sock;
 struct sockaddr_un un;
-size_t len;
 
 assert(unix_fn);
 
@@ -481,10 +480,9 @@ static int unix_sock_new(char *unix_fn)
 
 un.sun_family = AF_UNIX;
 (void)snprintf(un.sun_path, sizeof(un.sun_path), "%s", unix_fn);
-len = sizeof(un.sun_family) + strlen(un.sun_path);
 
 (void)unlink(unix_fn);
-if (bind(sock, (struct sockaddr *)&un, len) < 0) {
+if (bind(sock, (struct sockaddr *)&un, sizeof(un)) < 0) {
 perror("bind");
 goto fail;
 }
-- 
2.44.0




[PATCH for-9.1 v3 09/11] hostmem: add a new memory backend based on POSIX shm_open()

2024-04-04 Thread Stefano Garzarella
shm_open() creates and opens a new POSIX shared memory object.
A POSIX shared memory object allows creating a memory backend with an
associated file descriptor that can be shared with external processes
(e.g. vhost-user).

The new `memory-backend-shm` can be used as an alternative when
`memory-backend-memfd` is not available (Linux only), since shm_open()
should be provided by any POSIX-compliant operating system.

This backend mimics memfd, allocating memory that is practically
anonymous. In theory shm_open() requires a name, but this is allocated
for a short time interval and shm_unlink() is called right after
shm_open(). After that, only the fd is shared with external processes
(e.g., vhost-user) as if it were associated with anonymous memory.

In the future we may also allow the user to specify the name to be
passed to shm_open(), but for now we keep the backend simple, mimicking
anonymous memory such as memfd.
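
The "create, then unlink right away" idea described above can be sketched
as follows. This is a standalone illustration, not the code in
backends/hostmem-shm.c: the helper name and the naming scheme are
assumptions made for the example only.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: return an fd backed by `size` bytes of POSIX shared
 * memory that no longer has a name, or -1 on error.
 */
static int example_anon_shm_create(size_t size)
{
    char name[64];
    int fd;

    assert(size > 0);

    /* Illustrative naming scheme; O_EXCL makes a name clash fail cleanly. */
    (void)snprintf(name, sizeof(name), "/example-shm-%ld", (long)getpid());

    /* mode 0: nobody else can open the object by name while it exists. */
    fd = shm_open(name, O_RDWR | O_CREAT | O_EXCL, 0);
    if (fd < 0) {
        return -1;
    }

    /* Drop the name immediately: from now on the memory lives only via fd. */
    (void)shm_unlink(name);

    if (ftruncate(fd, (off_t)size) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

The returned fd can then be passed to mmap() with MAP_SHARED and handed to
another process over a UNIX socket, which is what a vhost-user frontend
needs. On older glibc versions linking may require -lrt.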

Signed-off-by: Stefano Garzarella 
---
v3
- enriched commit message and documentation to highlight that we
  want to mimic memfd (David)
---
 docs/system/devices/vhost-user.rst |   5 +-
 qapi/qom.json  |  17 +
 backends/hostmem-shm.c | 118 +
 backends/meson.build   |   1 +
 qemu-options.hx|  11 +++
 5 files changed, 150 insertions(+), 2 deletions(-)
 create mode 100644 backends/hostmem-shm.c

diff --git a/docs/system/devices/vhost-user.rst b/docs/system/devices/vhost-user.rst
index 9b2da106ce..35259d8ec7 100644
--- a/docs/system/devices/vhost-user.rst
+++ b/docs/system/devices/vhost-user.rst
@@ -98,8 +98,9 @@ Shared memory object
 
 In order for the daemon to access the VirtIO queues to process the
 requests it needs access to the guest's address space. This is
-achieved via the ``memory-backend-file`` or ``memory-backend-memfd``
-objects. A reference to a file-descriptor which can access this object
+achieved via the ``memory-backend-file``, ``memory-backend-memfd``, or
+``memory-backend-shm`` objects.
+A reference to a file-descriptor which can access this object
 will be passed via the socket as part of the protocol negotiation.
 
 Currently the shared memory object needs to match the size of the main
diff --git a/qapi/qom.json b/qapi/qom.json
index 85e6b4f84a..5252ec69e3 100644
--- a/qapi/qom.json
+++ b/qapi/qom.json
@@ -721,6 +721,19 @@
 '*hugetlbsize': 'size',
 '*seal': 'bool' } }
 
+##
+# @MemoryBackendShmProperties:
+#
+# Properties for memory-backend-shm objects.
+#
+# The @share boolean option is true by default with shm.
+#
+# Since: 9.1
+##
+{ 'struct': 'MemoryBackendShmProperties',
+  'base': 'MemoryBackendProperties',
+  'data': { } }
+
 ##
 # @MemoryBackendEpcProperties:
 #
@@ -976,6 +989,8 @@
 { 'name': 'memory-backend-memfd',
   'if': 'CONFIG_LINUX' },
 'memory-backend-ram',
+{ 'name': 'memory-backend-shm',
+  'if': 'CONFIG_POSIX' },
 'pef-guest',
 { 'name': 'pr-manager-helper',
   'if': 'CONFIG_LINUX' },
@@ -1047,6 +1062,8 @@
   'memory-backend-memfd':   { 'type': 'MemoryBackendMemfdProperties',
   'if': 'CONFIG_LINUX' },
   'memory-backend-ram': 'MemoryBackendProperties',
+  'memory-backend-shm': { 'type': 'MemoryBackendShmProperties',
+  'if': 'CONFIG_POSIX' },
   'pr-manager-helper':  { 'type': 'PrManagerHelperProperties',
   'if': 'CONFIG_LINUX' },
   'qtest':  'QtestProperties',
diff --git a/backends/hostmem-shm.c b/backends/hostmem-shm.c
new file mode 100644
index 00..7595204d29
--- /dev/null
+++ b/backends/hostmem-shm.c
@@ -0,0 +1,118 @@
+/*
+ * QEMU host POSIX shared memory object backend
+ *
+ * Copyright (C) 2024 Red Hat Inc
+ *
+ * Authors:
+ *   Stefano Garzarella 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "sysemu/hostmem.h"
+#include "qapi/error.h"
+
+#define TYPE_MEMORY_BACKEND_SHM "memory-backend-shm"
+
+OBJECT_DECLARE_SIMPLE_TYPE(HostMemoryBackendShm, MEMORY_BACKEND_SHM)
+
+struct HostMemoryBackendShm {
+HostMemoryBackend parent_obj;
+};
+
+static bool
+shm_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
+{
+g_autoptr(GString) shm_name = g_string_new(NULL);
+g_autofree char *backend_name = NULL;
+uint32_t ram_flags;
+int fd, oflag;
+mode_t mode;
+
+if (!backend->size) {
+error_setg(errp, "can't create backend with size 0");
+return false;
+}
+
+/*
+ * Let's use `mode = 0` because we don't want other processes to open our
+ * memory unless we share the file descriptor with them.
+ */
+mode = 0;
+oflag = O_RDWR | O_CREAT | O_EXCL;
+backend_name = host_mem

[PATCH for-9.1 v3 01/11] libvhost-user: set msg.msg_control to NULL when it is empty

2024-04-04 Thread Stefano Garzarella
On some OSes (e.g. macOS) sendmsg() returns -1 (errno EINVAL) if
the `struct msghdr` has the field `msg_controllen` set to 0, but
`msg_control` is not NULL.
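
A standalone sketch of the pattern follows. The function name and
EXAMPLE_MAX_FDS are hypothetical and this is not the libvhost-user code;
the point is only that msg_control and msg_controllen are kept consistent
with each other when there are no fds to pass.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

#define EXAMPLE_MAX_FDS 8

/*
 * Hypothetical helper: send `len` bytes from `buf`, optionally attaching
 * `nfds` file descriptors as SCM_RIGHTS ancillary data.
 */
static ssize_t example_send_with_fds(int sock, void *buf, size_t len,
                                     const int *fds, size_t nfds)
{
    char control[CMSG_SPACE(EXAMPLE_MAX_FDS * sizeof(int))];
    struct iovec iov = { .iov_base = buf, .iov_len = len };
    struct msghdr msg;

    assert(nfds <= EXAMPLE_MAX_FDS);

    memset(&msg, 0, sizeof(msg));
    memset(control, 0, sizeof(control));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;

    if (nfds > 0) {
        size_t fdsize = nfds * sizeof(int);
        struct cmsghdr *cmsg;

        msg.msg_control = control;
        msg.msg_controllen = CMSG_SPACE(fdsize);
        cmsg = CMSG_FIRSTHDR(&msg);
        cmsg->cmsg_len = CMSG_LEN(fdsize);
        cmsg->cmsg_level = SOL_SOCKET;
        cmsg->cmsg_type = SCM_RIGHTS;
        memcpy(CMSG_DATA(cmsg), fds, fdsize);
    } else {
        /* No ancillary data: clear both fields, not just the length. */
        msg.msg_control = NULL;
        msg.msg_controllen = 0;
    }

    return sendmsg(sock, &msg, 0);
}
```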

Reviewed-by: Eric Blake 
Reviewed-by: David Hildenbrand 
Signed-off-by: Stefano Garzarella 
---
 subprojects/libvhost-user/libvhost-user.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/subprojects/libvhost-user/libvhost-user.c b/subprojects/libvhost-user/libvhost-user.c
index a879149fef..22bea0c775 100644
--- a/subprojects/libvhost-user/libvhost-user.c
+++ b/subprojects/libvhost-user/libvhost-user.c
@@ -632,6 +632,7 @@ vu_message_write(VuDev *dev, int conn_fd, VhostUserMsg *vmsg)
 memcpy(CMSG_DATA(cmsg), vmsg->fds, fdsize);
 } else {
 msg.msg_controllen = 0;
+msg.msg_control = NULL;
 }
 
 do {
-- 
2.44.0




[PATCH for-9.1 v3 11/11] tests/qtest/vhost-user-test: add a test case for memory-backend-shm

2024-04-04 Thread Stefano Garzarella
`memory-backend-shm` can be used with vhost-user devices, so let's
add a new test case for it.

Signed-off-by: Stefano Garzarella 
---
 tests/qtest/vhost-user-test.c | 23 +++
 1 file changed, 23 insertions(+)

diff --git a/tests/qtest/vhost-user-test.c b/tests/qtest/vhost-user-test.c
index d4e437265f..8c1d903b2a 100644
--- a/tests/qtest/vhost-user-test.c
+++ b/tests/qtest/vhost-user-test.c
@@ -44,6 +44,8 @@
 "mem-path=%s,share=on -numa node,memdev=mem"
 #define QEMU_CMD_MEMFD  " -m %d -object memory-backend-memfd,id=mem,size=%dM," \
 " -numa node,memdev=mem"
+#define QEMU_CMD_SHM" -m %d -object memory-backend-shm,id=mem,size=%dM," \
+" -numa node,memdev=mem"
 #define QEMU_CMD_CHR" -chardev socket,id=%s,path=%s%s"
 #define QEMU_CMD_NETDEV " -netdev vhost-user,id=hs0,chardev=%s,vhostforce=on"
 
@@ -195,6 +197,7 @@ enum test_memfd {
 TEST_MEMFD_AUTO,
 TEST_MEMFD_YES,
 TEST_MEMFD_NO,
+TEST_MEMFD_SHM,
 };
 
 static void append_vhost_net_opts(TestServer *s, GString *cmd_line,
@@ -228,6 +231,8 @@ static void append_mem_opts(TestServer *server, GString *cmd_line,
 
 if (memfd == TEST_MEMFD_YES) {
 g_string_append_printf(cmd_line, QEMU_CMD_MEMFD, size, size);
+} else if (memfd == TEST_MEMFD_SHM) {
+g_string_append_printf(cmd_line, QEMU_CMD_SHM, size, size);
 } else {
 const char *root = init_hugepagefs() ? : server->tmpfs;
 
@@ -788,6 +793,19 @@ static void *vhost_user_test_setup_memfd(GString *cmd_line, void *arg)
 return server;
 }
 
+static void *vhost_user_test_setup_shm(GString *cmd_line, void *arg)
+{
+TestServer *server = test_server_new("vhost-user-test", arg);
+test_server_listen(server);
+
+append_mem_opts(server, cmd_line, 256, TEST_MEMFD_SHM);
+server->vu_ops->append_opts(server, cmd_line, "");
+
+g_test_queue_destroy(vhost_user_test_cleanup, server);
+
+return server;
+}
+
 static void test_read_guest_mem(void *obj, void *arg, QGuestAllocator *alloc)
 {
 TestServer *server = arg;
@@ -1081,6 +1099,11 @@ static void register_vhost_user_test(void)
  "virtio-net",
  test_read_guest_mem, &opts);
 
+opts.before = vhost_user_test_setup_shm;
+qos_add_test("vhost-user/read-guest-mem/shm",
+ "virtio-net",
+ test_read_guest_mem, &opts);
+
 if (qemu_memfd_check(MFD_ALLOW_SEALING)) {
 opts.before = vhost_user_test_setup_memfd;
 qos_add_test("vhost-user/read-guest-mem/memfd",
-- 
2.44.0




[PATCH for-9.1 v3 07/11] libvhost-user: enable it on any POSIX system

2024-04-04 Thread Stefano Garzarella
The vhost-user protocol is not really Linux-specific so let's enable
libvhost-user for any POSIX system.

When compiling it on macOS and FreeBSD, some problems came up:
- avoid including linux/vhost.h, which is available only on Linux
  (vhost_types.h contains many of the things we need)
- macOS doesn't provide sys/endian.h, so let's define the endian
  conversion macros ourselves (note: libvhost-user doesn't include
  QEMU's headers, so we can't use "qemu/bswap.h")
- define eventfd_[write|read] as write/read wrappers when the system
  doesn't provide them (e.g. macOS)
- copy the SEAL defines from include/qemu/memfd.h to make the code work
  on FreeBSD, where MFD_ALLOW_SEALING is defined
- define MAP_NORESERVE if it's not defined (e.g. on FreeBSD)
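
To make the eventfd_[write|read] fallback from the list above concrete,
here is a small sketch. The example_ names are hypothetical, chosen so they
cannot clash with a system-provided <sys/eventfd.h>. The wrappers simply
move one 8-byte counter value through any readable/writable fd.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

typedef uint64_t example_eventfd_t;

/* Write one 8-byte value; 0 on success, -1 on error or short write. */
static int example_eventfd_write(int fd, example_eventfd_t value)
{
    ssize_t n = write(fd, &value, sizeof(value));

    return n == (ssize_t)sizeof(value) ? 0 : -1;
}

/* Read one 8-byte value; 0 on success, -1 on error or short read. */
static int example_eventfd_read(int fd, example_eventfd_t *value)
{
    ssize_t n;

    assert(value);
    n = read(fd, value, sizeof(*value));
    return n == (ssize_t)sizeof(*value) ? 0 : -1;
}
```

On a real eventfd the kernel accumulates written values into one counter.
On a pipe or socket these wrappers only transport individual 8-byte values,
which is enough for a plain "kick" notification.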

Signed-off-by: Stefano Garzarella 
---
 meson.build   |  2 +-
 subprojects/libvhost-user/libvhost-user.h |  2 +-
 subprojects/libvhost-user/libvhost-user.c | 60 +--
 3 files changed, 59 insertions(+), 5 deletions(-)

diff --git a/meson.build b/meson.build
index c19d51501a..3197a2f62e 100644
--- a/meson.build
+++ b/meson.build
@@ -3194,7 +3194,7 @@ endif
 config_host_data.set('CONFIG_FDT', fdt.found())
 
 vhost_user = not_found
-if host_os == 'linux' and have_vhost_user
+if have_vhost_user
   libvhost_user = subproject('libvhost-user')
   vhost_user = libvhost_user.get_variable('vhost_user_dep')
 endif
diff --git a/subprojects/libvhost-user/libvhost-user.h b/subprojects/libvhost-user/libvhost-user.h
index deb40e77b3..e13e1d3931 100644
--- a/subprojects/libvhost-user/libvhost-user.h
+++ b/subprojects/libvhost-user/libvhost-user.h
@@ -18,9 +18,9 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include "standard-headers/linux/virtio_ring.h"
+#include "standard-headers/linux/vhost_types.h"
 
 /* Based on qemu/hw/virtio/vhost-user.c */
 #define VHOST_USER_F_PROTOCOL_FEATURES 30
diff --git a/subprojects/libvhost-user/libvhost-user.c b/subprojects/libvhost-user/libvhost-user.c
index 1c361ffd51..03edb4bf64 100644
--- a/subprojects/libvhost-user/libvhost-user.c
+++ b/subprojects/libvhost-user/libvhost-user.c
@@ -28,9 +28,7 @@
 #include 
 #include 
 #include 
-#include 
 #include 
-#include 
 
 /* Necessary to provide VIRTIO_F_VERSION_1 on system
  * with older linux headers. Must appear before
@@ -39,8 +37,8 @@
 #include "standard-headers/linux/virtio_config.h"
 
 #if defined(__linux__)
+#include 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -52,6 +50,62 @@
 
 #endif
 
+#if defined(__APPLE__) && (__MACH__)
+#include 
+#define htobe16(x) OSSwapHostToBigInt16(x)
+#define htole16(x) OSSwapHostToLittleInt16(x)
+#define be16toh(x) OSSwapBigToHostInt16(x)
+#define le16toh(x) OSSwapLittleToHostInt16(x)
+
+#define htobe32(x) OSSwapHostToBigInt32(x)
+#define htole32(x) OSSwapHostToLittleInt32(x)
+#define be32toh(x) OSSwapBigToHostInt32(x)
+#define le32toh(x) OSSwapLittleToHostInt32(x)
+
+#define htobe64(x) OSSwapHostToBigInt64(x)
+#define htole64(x) OSSwapHostToLittleInt64(x)
+#define be64toh(x) OSSwapBigToHostInt64(x)
+#define le64toh(x) OSSwapLittleToHostInt64(x)
+#endif
+
+#ifdef CONFIG_EVENTFD
+#include 
+#else
+#define eventfd_t uint64_t
+
+int eventfd_write(int fd, eventfd_t value)
+{
+return (write(fd, &value, sizeof(value)) == sizeof(value)) ? 0 : -1;
+}
+
+int eventfd_read(int fd, eventfd_t *value)
+{
+return (read(fd, value, sizeof(*value)) == sizeof(*value)) ? 0 : -1;
+}
+#endif
+
+#ifdef MFD_ALLOW_SEALING
+#include 
+
+#ifndef F_LINUX_SPECIFIC_BASE
+#define F_LINUX_SPECIFIC_BASE 1024
+#endif
+
+#ifndef F_ADD_SEALS
+#define F_ADD_SEALS (F_LINUX_SPECIFIC_BASE + 9)
+#define F_GET_SEALS (F_LINUX_SPECIFIC_BASE + 10)
+
+#define F_SEAL_SEAL 0x0001  /* prevent further seals from being set */
+#define F_SEAL_SHRINK   0x0002  /* prevent file from shrinking */
+#define F_SEAL_GROW 0x0004  /* prevent file from growing */
+#define F_SEAL_WRITE0x0008  /* prevent writes */
+#endif
+#endif
+
+#ifndef MAP_NORESERVE
+#define MAP_NORESERVE 0
+#endif
+
 #include "include/atomic.h"
 
 #include "libvhost-user.h"
-- 
2.44.0




[PATCH for-9.1 v3 04/11] vhost-user-server: do not set memory fd non-blocking

2024-04-04 Thread Stefano Garzarella
In vhost-user-server we set all fds received from the other peer
to non-blocking mode. For some of them (e.g. memfd, shm_open, etc.)
it's not really needed, because we don't use these fds with blocking
operations, but only to map memory.

In addition, in some systems this operation can fail (e.g. in macOS
setting an fd returned by shm_open() non-blocking fails with errno
= ENOTTY).

So, let's avoid setting fd non-blocking for those messages that we
know carry memory fd (e.g. VHOST_USER_ADD_MEM_REG,
VHOST_USER_SET_MEM_TABLE).
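
The filtering can be illustrated outside QEMU with a small sketch. The enum
and function names below are hypothetical stand-ins, and the numeric values
are assumed to follow the vhost-user specification but are used here only
for illustration. The idea is to return early for requests whose fds are
only ever passed to mmap().

```c
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative stand-ins for the vhost-user request codes. */
enum example_request {
    EXAMPLE_SET_MEM_TABLE = 5,
    EXAMPLE_SET_VRING_KICK = 12,
    EXAMPLE_ADD_MEM_REG = 37,
};

/* True for requests whose fds are only used to map guest memory. */
static bool example_carries_memory_fds(int request)
{
    return request == EXAMPLE_ADD_MEM_REG ||
           request == EXAMPLE_SET_MEM_TABLE;
}

static int example_set_nonblock(int fd)
{
    int flags = fcntl(fd, F_GETFL);

    if (flags < 0) {
        return -1;
    }
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Returns how many fds were switched to non-blocking mode. */
static int example_unblock_fds(int request, const int *fds, size_t nfds)
{
    int changed = 0;
    size_t i;

    if (example_carries_memory_fds(request)) {
        return 0; /* memory fds: nothing to do, and it may fail anyway */
    }

    for (i = 0; i < nfds; i++) {
        if (example_set_nonblock(fds[i]) == 0) {
            changed++;
        }
    }
    return changed;
}
```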

Signed-off-by: Stefano Garzarella 
---
v3:
- avoiding setting fd non-blocking for messages where we have memory fd
  (Eric)
---
 util/vhost-user-server.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/util/vhost-user-server.c b/util/vhost-user-server.c
index 3bfb1ad3ec..b19229074a 100644
--- a/util/vhost-user-server.c
+++ b/util/vhost-user-server.c
@@ -65,6 +65,18 @@ static void vmsg_close_fds(VhostUserMsg *vmsg)
 static void vmsg_unblock_fds(VhostUserMsg *vmsg)
 {
 int i;
+
+/*
+ * These messages carry fd used to map memory, not to send/receive messages,
+ * so this operation is useless. In addition, in some systems this
+ * operation can fail (e.g. in macOS setting an fd returned by shm_open()
+ * non-blocking fails with errno = ENOTTY)
+ */
+if (vmsg->request == VHOST_USER_ADD_MEM_REG ||
+vmsg->request == VHOST_USER_SET_MEM_TABLE) {
+return;
+}
+
 for (i = 0; i < vmsg->fd_num; i++) {
 qemu_socket_set_nonblock(vmsg->fds[i]);
 }
-- 
2.44.0




[PATCH for-9.1 v3 02/11] libvhost-user: fail vu_message_write() if sendmsg() is failing

2024-04-04 Thread Stefano Garzarella
In vu_message_write() we use sendmsg() to send the message header,
then a write() to send the payload.

If sendmsg() fails we should avoid sending the payload, since we
were unable to send the header.

Discovered before fixing the issue with the previous patch, where
sendmsg() failed on macOS due to wrong parameters, but the frontend
still sent the payload which the backend incorrectly interpreted
as a wrong header.
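
The control flow can be sketched in isolation like this. The function name
is hypothetical and the sketch only mirrors the idea of the fix, not the
exact libvhost-user code: if the header cannot be sent, the payload must
not follow it.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Hypothetical helper: send a header with sendmsg(), then the payload
 * with write(). Returns false as soon as one of the two steps fails.
 */
static bool example_send_message(int sock, void *hdr, size_t hdr_len,
                                 const void *payload, size_t payload_len)
{
    struct iovec iov = { .iov_base = hdr, .iov_len = hdr_len };
    struct msghdr msg;
    ssize_t rc;

    assert(hdr && hdr_len > 0);

    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;

    do {
        rc = sendmsg(sock, &msg, 0);
    } while (rc < 0 && (errno == EINTR || errno == EAGAIN));

    if (rc <= 0) {
        /* Header not sent: stop here, the peer must not see the payload. */
        return false;
    }

    if (payload_len) {
        do {
            rc = write(sock, payload, payload_len);
        } while (rc < 0 && (errno == EINTR || errno == EAGAIN));

        if (rc <= 0) {
            return false;
        }
    }
    return true;
}
```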

Signed-off-by: Stefano Garzarella 
---
 subprojects/libvhost-user/libvhost-user.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/subprojects/libvhost-user/libvhost-user.c b/subprojects/libvhost-user/libvhost-user.c
index 22bea0c775..a11afd1960 100644
--- a/subprojects/libvhost-user/libvhost-user.c
+++ b/subprojects/libvhost-user/libvhost-user.c
@@ -639,6 +639,11 @@ vu_message_write(VuDev *dev, int conn_fd, VhostUserMsg *vmsg)
 rc = sendmsg(conn_fd, &msg, 0);
 } while (rc < 0 && (errno == EINTR || errno == EAGAIN));
 
+if (rc <= 0) {
+vu_panic(dev, "Error while writing: %s", strerror(errno));
+return false;
+}
+
 if (vmsg->size) {
 do {
 if (vmsg->data) {
-- 
2.44.0




[PATCH for-9.1 v3 10/11] tests/qtest/vhost-user-blk-test: use memory-backend-shm

2024-04-04 Thread Stefano Garzarella
`memory-backend-memfd` is available only on Linux while the new
`memory-backend-shm` can be used on any POSIX-compliant operating
system. Let's use it so we can run the test in multiple environments.

Signed-off-by: Stefano Garzarella 
---
 tests/qtest/vhost-user-blk-test.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/qtest/vhost-user-blk-test.c b/tests/qtest/vhost-user-blk-test.c
index 117b9acd10..e945f6abf2 100644
--- a/tests/qtest/vhost-user-blk-test.c
+++ b/tests/qtest/vhost-user-blk-test.c
@@ -906,7 +906,7 @@ static void start_vhost_user_blk(GString *cmd_line, int vus_instances,
vhost_user_blk_bin);
 
 g_string_append_printf(cmd_line,
-" -object memory-backend-memfd,id=mem,size=256M,share=on "
+" -object memory-backend-shm,id=mem,size=256M,share=on "
 " -M memory-backend=mem -m 256M ");
 
 for (i = 0; i < vus_instances; i++) {
-- 
2.44.0




[PATCH for-9.1 v3 06/11] vhost-user: enable frontends on any POSIX system

2024-04-04 Thread Stefano Garzarella
The vhost-user protocol is not really Linux-specific so let's enable
vhost-user frontends for any POSIX system.

In vhost_net.c we use VHOST_FILE_UNBIND which is defined in a Linux
specific header, let's define it for other systems as well.

Signed-off-by: Stefano Garzarella 
---
 meson.build| 1 -
 hw/net/vhost_net.c | 5 +
 hw/block/Kconfig   | 2 +-
 3 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/meson.build b/meson.build
index c9c3217ba4..c19d51501a 100644
--- a/meson.build
+++ b/meson.build
@@ -151,7 +151,6 @@ have_tpm = get_option('tpm') \
 
 # vhost
 have_vhost_user = get_option('vhost_user') \
-  .disable_auto_if(host_os != 'linux') \
   .require(host_os != 'windows',
error_message: 'vhost-user is not available on Windows').allowed()
 have_vhost_vdpa = get_option('vhost_vdpa') \
diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
index fd1a93701a..fced429813 100644
--- a/hw/net/vhost_net.c
+++ b/hw/net/vhost_net.c
@@ -34,8 +34,13 @@
 #include "standard-headers/linux/virtio_ring.h"
 #include "hw/virtio/vhost.h"
 #include "hw/virtio/virtio-bus.h"
+#if defined(__linux__)
 #include "linux-headers/linux/vhost.h"
+#endif
 
+#ifndef VHOST_FILE_UNBIND
+#define VHOST_FILE_UNBIND -1
+#endif
 
 /* Features supported by host kernel. */
 static const int kernel_feature_bits[] = {
diff --git a/hw/block/Kconfig b/hw/block/Kconfig
index 9e8f28f982..29ee09e434 100644
--- a/hw/block/Kconfig
+++ b/hw/block/Kconfig
@@ -40,7 +40,7 @@ config VHOST_USER_BLK
 bool
 # Only PCI devices are provided for now
 default y if VIRTIO_PCI
-depends on VIRTIO && VHOST_USER && LINUX
+depends on VIRTIO && VHOST_USER
 
 config SWIM
 bool
-- 
2.44.0




[PATCH for-9.1 v3 03/11] libvhost-user: mask F_INFLIGHT_SHMFD if memfd is not supported

2024-04-04 Thread Stefano Garzarella
libvhost-user will panic when receiving a VHOST_USER_GET_INFLIGHT_FD
message if MFD_ALLOW_SEALING is not defined, since it's not able
to create a memfd.

VHOST_USER_GET_INFLIGHT_FD is used only if
VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD is negotiated. So, let's mask
that feature if the backend is not able to properly handle these
messages.

Signed-off-by: Stefano Garzarella 
---
 subprojects/libvhost-user/libvhost-user.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/subprojects/libvhost-user/libvhost-user.c b/subprojects/libvhost-user/libvhost-user.c
index a11afd1960..1c361ffd51 100644
--- a/subprojects/libvhost-user/libvhost-user.c
+++ b/subprojects/libvhost-user/libvhost-user.c
@@ -1674,6 +1674,16 @@ vu_get_protocol_features_exec(VuDev *dev, VhostUserMsg *vmsg)
 features |= dev->iface->get_protocol_features(dev);
 }
 
+/*
+ * If MFD_ALLOW_SEALING is not defined, we are not able to handle
+ * VHOST_USER_GET_INFLIGHT_FD messages, since we can't create a memfd.
+ * Those messages are used only if VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD
+ * is negotiated. A device implementation can enable it, so let's mask
+ * it to avoid a runtime panic.
+ */
+#ifndef MFD_ALLOW_SEALING
+features &= ~(1ULL << VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD);
+#endif
 vmsg_set_reply_u64(vmsg, features);
 return true;
 }
-- 
2.44.0
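
To spell out why masking the bit on the backend side is enough, here is a
minimal sketch in C. It assumes a simplified negotiation model in which the
negotiated protocol features are the bitwise AND of what the backend offers
and what the frontend asks for. The helper names are hypothetical and are not
part of libvhost-user; only the bit number comes from the vhost-user
specification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit number as listed in the vhost-user specification. */
#define VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD 12

/*
 * Hypothetical helper modelled on the patch: when the backend cannot
 * create a memfd, drop INFLIGHT_SHMFD from the features it offers.
 */
static uint64_t backend_protocol_features(uint64_t features, bool have_memfd)
{
    /* The bit must fit in the 64-bit feature mask. */
    assert(VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD < 64);

    if (!have_memfd) {
        features &= ~(1ULL << VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD);
    }
    return features;
}

/*
 * Simplified model (an assumption of this sketch): a feature is negotiated
 * only if both sides set it, and VHOST_USER_GET_INFLIGHT_FD is sent only
 * when INFLIGHT_SHMFD was negotiated.
 */
static bool inflight_fd_may_be_requested(uint64_t backend_features,
                                         uint64_t frontend_wanted)
{
    uint64_t negotiated = backend_features & frontend_wanted;

    return (negotiated & (1ULL << VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD)) != 0;
}
```

With the bit masked, a frontend that asks for every feature still never ends
up with INFLIGHT_SHMFD negotiated, so the message that would trigger the
panic is never sent.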




[PATCH for-9.1 v3 00/11] vhost-user: support any POSIX system (tested on macOS, FreeBSD, OpenBSD)

2024-04-04 Thread Stefano Garzarella
v1: https://patchew.org/QEMU/20240228114759.44758-1-sgarz...@redhat.com/
v2: https://patchew.org/QEMU/20240326133936.125332-1-sgarz...@redhat.com/
v3:
  - rebased on v9.0.0-rc2
  - patch 4: avoid setting the fd non-blocking for messages that carry
a memory fd (Eric)
  - patch 9: enriched commit message and documentation to highlight that we
want to mimic memfd (David)

The vhost-user protocol is not really Linux-specific, so let's try to support
QEMU's frontends and backends (including libvhost-user) on any POSIX system
with this series. The main use case is to be able to use virtio devices that
we don't have built-in in QEMU (e.g. virtiofsd, vhost-user-vsock, etc.) even
on non-Linux systems.

The first 5 patches are fixes for problems discovered at runtime on macOS or
FreeBSD, and could even go in independently of this series.

Patches 6, 7, and 8 enable building of frontends and backends (including
libvhost-user) with associated code changes to succeed in compilation.

Patch 9 adds `memory-backend-shm` that uses the POSIX shm_open() API to
create shared memory which is identified by an fd that can be shared with
vhost-user backends. This is useful on those systems (like macOS) where
we don't have memfd_create() or special filesystems like "/dev/shm".
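
As a rough illustration of the idea (not the code in backends/hostmem-shm.c),
here is a minimal sketch of getting a memfd-like fd out of shm_open(). The
helper names and the pid-based object name are assumptions made up for this
example. The object is unlinked right after creation, so it stays alive only
as long as an fd or mapping refers to it, which is the memfd behavior we want
to mimic.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: return an fd backed by `size` bytes of POSIX shared
 * memory, or -1 on error. The name is only needed for the shm_open() call.
 */
static int shm_anon_fd(size_t size)
{
    char name[32];
    int fd;

    assert(size > 0);

    /* Assumption: a pid-based name is unique enough for a sketch. */
    snprintf(name, sizeof(name), "/shm-sketch-%ld", (long)getpid());

    fd = shm_open(name, O_RDWR | O_CREAT | O_EXCL, 0600);
    if (fd < 0) {
        return -1;
    }

    /* Drop the name immediately; the fd keeps the object alive. */
    shm_unlink(name);

    if (ftruncate(fd, (off_t)size) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/*
 * Map the same fd twice and check that a write through one mapping is
 * visible through the other. Returns 0 on success, -1 on failure.
 */
static int shm_roundtrip(size_t size)
{
    int fd = shm_anon_fd(size);
    char *a, *b;
    int ok;

    if (fd < 0) {
        return -1;
    }

    a = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (a == MAP_FAILED) {
        close(fd);
        return -1;
    }
    b = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (b == MAP_FAILED) {
        munmap(a, size);
        close(fd);
        return -1;
    }

    strcpy(a, "hello");
    ok = strcmp(b, "hello") == 0;

    munmap(a, size);
    munmap(b, size);
    close(fd);
    return ok ? 0 : -1;
}
```

In a vhost-user setup the second mapping would live in the backend process,
which receives the fd over the UNIX socket as ancillary data. On older glibc
versions this may need linking with -lrt.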

Patches 10 and 11 use `memory-backend-shm` in some vhost-user tests.

Maybe the first 5 patches can go separately, but I only discovered those
problems after testing patches 6 - 9, so I have included them in this series
for now. Please let me know if you prefer that I send them separately.

I tested this series using vhost-user-blk and QSD on macOS Sonoma 14.4
(aarch64), FreeBSD 14 (x86_64), OpenBSD 7.4 (x86_64), and Fedora 39 (x86_64)
in this way:

- Start vhost-user-blk or QSD (same commands for all systems)

  vhost-user-blk -s /tmp/vhost.socket \
-b Fedora-Cloud-Base-39-1.5.x86_64.raw

  qemu-storage-daemon \
--blockdev file,filename=Fedora-Cloud-Base-39-1.5.x86_64.qcow2,node-name=file \
--blockdev qcow2,file=file,node-name=qcow2 \
--export vhost-user-blk,addr.type=unix,addr.path=/tmp/vhost.socket,id=vub,num-queues=1,node-name=qcow2,writable=on

- macOS (aarch64): start QEMU (using hvf accelerator)

  qemu-system-aarch64 -smp 2 -cpu host -M virt,accel=hvf,memory-backend=mem \
-drive file=./build/pc-bios/edk2-aarch64-code.fd,if=pflash,format=raw,readonly=on \
-device virtio-net-device,netdev=net0 -netdev user,id=net0 \
-device ramfb -device usb-ehci -device usb-kbd \
-object memory-backend-shm,id=mem,size=512M \
-device vhost-user-blk-pci,num-queues=1,disable-legacy=on,chardev=char0 \
-chardev socket,id=char0,path=/tmp/vhost.socket

- FreeBSD/OpenBSD (x86_64): start QEMU (no accelerators available)

  qemu-system-x86_64 -smp 2 -M q35,memory-backend=mem \
-object memory-backend-shm,id=mem,size="512M" \
-device vhost-user-blk-pci,num-queues=1,chardev=char0 \
-chardev socket,id=char0,path=/tmp/vhost.socket

- Fedora (x86_64): start QEMU (using kvm accelerator)

  qemu-system-x86_64 -smp 2 -M q35,accel=kvm,memory-backend=mem \
-object memory-backend-shm,id=mem,size="512M" \
-device vhost-user-blk-pci,num-queues=1,chardev=char0 \
-chardev socket,id=char0,path=/tmp/vhost.socket

Branch pushed (and CI started) at 
https://gitlab.com/sgarzarella/qemu/-/tree/macos-vhost-user?ref_type=heads

Thanks,
Stefano

Stefano Garzarella (11):
  libvhost-user: set msg.msg_control to NULL when it is empty
  libvhost-user: fail vu_message_write() if sendmsg() is failing
  libvhost-user: mask F_INFLIGHT_SHMFD if memfd is not supported
  vhost-user-server: do not set memory fd non-blocking
  contrib/vhost-user-blk: fix bind() using the right size of the address
  vhost-user: enable frontends on any POSIX system
  libvhost-user: enable it on any POSIX system
  contrib/vhost-user-blk: enable it on any POSIX system
  hostmem: add a new memory backend based on POSIX shm_open()
  tests/qtest/vhost-user-blk-test: use memory-backend-shm
  tests/qtest/vhost-user-test: add a test case for memory-backend-shm

 docs/system/devices/vhost-user.rst|   5 +-
 meson.build   |   5 +-
 qapi/qom.json |  17 
 subprojects/libvhost-user/libvhost-user.h |   2 +-
 backends/hostmem-shm.c| 118 ++
 contrib/vhost-user-blk/vhost-user-blk.c   |  23 -
 hw/net/vhost_net.c|   5 +
 subprojects/libvhost-user/libvhost-user.c |  76 +-
 tests/qtest/vhost-user-blk-test.c |   2 +-
 tests/qtest/vhost-user-test.c |  23 +
 util/vhost-user-server.c  |  12 +++
 backends/meson.build  |   1 +
 hw/block/Kconfig  |   2 +-
 qemu-options.hx   |  11 ++
 util/meson.build  |   4 +-
 15 files changed, 288 insertions(+), 18 deletions(-)
 create mode 100644 backends/hostmem-shm.c

-- 
2.44.0



