Re: [Qemu-devel] [PATCH v2 3/4] 9p: persistency of QID path beyond reboots / suspensions

2019-05-07 Thread Greg Kurz
On Fri, 03 May 2019 18:23:03 +0200
Christian Schoenebeck  wrote:

> This patch aims to keep the QID path identical beyond the scope of reboots
> and guest suspensions. With the 1st patch alone, the QID path of the same
> files might change after reboots / suspensions, since 9p would restart with
> an empty qpp_table and the resulting QID path depends on the precise
> sequence of files being accessed on the guest.
> 
> The first patch should already avoid the vast majority of potential QID
> path collisions. However, network services running on the guest in
> particular would still be prone to QID path issues when using the 1st patch
> alone. For instance, Samba exports file IDs to clients on the network, and
> SMB clients will use those IDs to access and request changes on the file
> server. If the guest is interrupted in between, as commonly happens during
> maintenance (e.g. software updates on the host), then SMB clients on the
> network will continue working with stale file IDs, which in turn leads to
> data corruption and data loss on the file server. Furthermore, on the SMB
> client side I also encountered severe misbehaviours in this case: for
> instance, Macs accessing the file server would either start to hang or die
> with a kernel panic within seconds, since the smbx implementation on macOS
> heavily relies on file IDs being unique (within the context of a
> connection, that is).
> 
> So this patch mitigates the remaining problem described above by storing
> the qpp_table persistently as extended attribute(s) on the exported root of
> the file system, and automatically tries to restore the qpp_table, e.g.
> after reboots / resumptions.
> 
> This patch is aimed at real-world scenarios, in which the qpp_table will
> only ever get a few dozen entries (and none ever in qpf_table). So it is
> e.g. intentionally limited to storing only the qpp_table, not the
> qpf_table; and so far I have not made optimizations, since in practice the
> qpf_table is really just tiny.
> 
> Since there is currently no callback in QEMU yet that would reliably be
> called on guest shutdown, the table is stored on every new insertion for
> now.
> 
> Signed-off-by: Christian Schoenebeck 
> ---
>  hw/9pfs/9p.c | 315 ++-
>  hw/9pfs/9p.h |  33 +++
>  2 files changed, 343 insertions(+), 5 deletions(-)
> 

Yikes... both the changelog and the diffstat have an impressive size. Since
I'm likely the only QEMU developer who will review this, please expect some
delay before I get down to it... :-\ Of course, if you can split this into
smaller patches, it will probably help :)

Same remark for patch 4.

> diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c
> index 2b893e25a1..29c6dfc68a 100644
> --- a/hw/9pfs/9p.c
> +++ b/hw/9pfs/9p.c
> @@ -26,6 +26,19 @@
>  #include "migration/blocker.h"
>  #include "sysemu/qtest.h"
>  #include "qemu/xxhash.h"
> +#include "qemu/crc32c.h"
> +#if defined(__linux__) /* TODO: This should probably go into osdep.h instead */
> +# include <linux/limits.h> /* for XATTR_SIZE_MAX */
> +#endif
> +
> +/*
> + * How many bytes may we store to fs per extended attribute value?
> + */
> +#ifdef XATTR_SIZE_MAX
> +# define ATTR_MAX_SIZE XATTR_SIZE_MAX /* Linux only: 64kB limit in kernel */
> +#else
> +# define ATTR_MAX_SIZE 65536 /* Most systems allow a bit more, so we take this as basis. */
> +#endif
>  
>  int open_fd_hw;
>  int total_open_fd;
> @@ -642,6 +655,285 @@ static int qid_path_fullmap(V9fsPDU *pdu, const struct stat *stbuf,
>  return 0;
>  }
>  
> +static inline bool is_ro_export(FsContext *ctx)
> +{
> +return ctx->export_flags & V9FS_RDONLY;
> +}
> +
> +/*
> + * Once qpp_table size exceeds this value, we no longer save
> + * the table persistently. See comment in v9fs_store_qpp_table()
> + */
> +#define QPP_TABLE_PERSISTENCY_LIMIT 32768
> +
> +/* Remove all user.virtfs.system.qidp.* xattrs from export root. */
> +static void remove_qidp_xattr(FsContext *ctx)
> +{
> +V9fsString name;
> +int i;
> +
> +/* just for a paranoid endless recursion sanity check */
> +const ssize_t max_size =
> +sizeof(QppSrlzHeader) +
> +QPP_TABLE_PERSISTENCY_LIMIT * sizeof(QppEntryS);
> +
> +v9fs_string_init(&name);
> +for (i = 0; i * ATTR_MAX_SIZE < max_size; ++i) {
> +v9fs_string_sprintf(&name, "user.virtfs.system.qidp.%d", i);
> +if (lremovexattr(ctx->fs_root, name.data) < 0)
> +break;
> +}
> +v9fs_string_free(&name);
> +}
> +
> +/* Used to convert qpp hash table into continuous stream. */
> +static void qpp_table_serialize(void *p, uint32_t h, void *up)
> +{
> +const QppEntry *entry = (const QppEntry*) p;
> +QppSerialize *ser = (QppSerialize*) up;
> +
> +if (ser->error)
> +return;
> +
> +/* safety first */
> +if (entry->qp_prefix - 1 >= ser->count) {
> +ser->error = -1;
> +return;
> +}
> +
> +ser->elements[entry->qp_prefix - 1] = (QppEntryS) {
> +.dev = entry->dev,
> +

[Qemu-devel] [PATCH v2 3/4] 9p: persistency of QID path beyond reboots / suspensions

2019-05-03 Thread Christian Schoenebeck via Qemu-devel
This patch aims to keep the QID path identical beyond the scope of reboots
and guest suspensions. With the 1st patch alone, the QID path of the same
files might change after reboots / suspensions, since 9p would restart with
an empty qpp_table and the resulting QID path depends on the precise
sequence of files being accessed on the guest.

The first patch should already avoid the vast majority of potential QID
path collisions. However, network services running on the guest in
particular would still be prone to QID path issues when using the 1st patch
alone. For instance, Samba exports file IDs to clients on the network, and
SMB clients will use those IDs to access and request changes on the file
server. If the guest is interrupted in between, as commonly happens during
maintenance (e.g. software updates on the host), then SMB clients on the
network will continue working with stale file IDs, which in turn leads to
data corruption and data loss on the file server. Furthermore, on the SMB
client side I also encountered severe misbehaviours in this case: for
instance, Macs accessing the file server would either start to hang or die
with a kernel panic within seconds, since the smbx implementation on macOS
heavily relies on file IDs being unique (within the context of a
connection, that is).

So this patch mitigates the remaining problem described above by storing
the qpp_table persistently as extended attribute(s) on the exported root of
the file system, and automatically tries to restore the qpp_table, e.g.
after reboots / resumptions.

This patch is aimed at real-world scenarios, in which the qpp_table will
only ever get a few dozen entries (and none ever in qpf_table). So it is
e.g. intentionally limited to storing only the qpp_table, not the
qpf_table; and so far I have not made optimizations, since in practice the
qpf_table is really just tiny.

Since there is currently no callback in QEMU yet that would reliably be
called on guest shutdown, the table is stored on every new insertion for
now.

Signed-off-by: Christian Schoenebeck 
---
 hw/9pfs/9p.c | 315 ++-
 hw/9pfs/9p.h |  33 +++
 2 files changed, 343 insertions(+), 5 deletions(-)

diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c
index 2b893e25a1..29c6dfc68a 100644
--- a/hw/9pfs/9p.c
+++ b/hw/9pfs/9p.c
@@ -26,6 +26,19 @@
 #include "migration/blocker.h"
 #include "sysemu/qtest.h"
 #include "qemu/xxhash.h"
+#include "qemu/crc32c.h"
+#if defined(__linux__) /* TODO: This should probably go into osdep.h instead */
+# include <linux/limits.h> /* for XATTR_SIZE_MAX */
+#endif
+
+/*
+ * How many bytes may we store to fs per extended attribute value?
+ */
+#ifdef XATTR_SIZE_MAX
+# define ATTR_MAX_SIZE XATTR_SIZE_MAX /* Linux only: 64kB limit in kernel */
+#else
+# define ATTR_MAX_SIZE 65536 /* Most systems allow a bit more, so we take this as basis. */
+#endif
 
 int open_fd_hw;
 int total_open_fd;
@@ -642,6 +655,285 @@ static int qid_path_fullmap(V9fsPDU *pdu, const struct stat *stbuf,
 return 0;
 }
 
+static inline bool is_ro_export(FsContext *ctx)
+{
+return ctx->export_flags & V9FS_RDONLY;
+}
+
+/*
+ * Once qpp_table size exceeds this value, we no longer save
+ * the table persistently. See comment in v9fs_store_qpp_table()
+ */
+#define QPP_TABLE_PERSISTENCY_LIMIT 32768
+
+/* Remove all user.virtfs.system.qidp.* xattrs from export root. */
+static void remove_qidp_xattr(FsContext *ctx)
+{
+V9fsString name;
+int i;
+
+/* just for a paranoid endless recursion sanity check */
+const ssize_t max_size =
+sizeof(QppSrlzHeader) +
+QPP_TABLE_PERSISTENCY_LIMIT * sizeof(QppEntryS);
+
+v9fs_string_init(&name);
+for (i = 0; i * ATTR_MAX_SIZE < max_size; ++i) {
+v9fs_string_sprintf(&name, "user.virtfs.system.qidp.%d", i);
+if (lremovexattr(ctx->fs_root, name.data) < 0)
+break;
+}
+v9fs_string_free(&name);
+}
+
+/* Used to convert qpp hash table into continuous stream. */
+static void qpp_table_serialize(void *p, uint32_t h, void *up)
+{
+const QppEntry *entry = (const QppEntry*) p;
+QppSerialize *ser = (QppSerialize*) up;
+
+if (ser->error)
+return;
+
+/* safety first */
+if (entry->qp_prefix - 1 >= ser->count) {
+ser->error = -1;
+return;
+}
+
+ser->elements[entry->qp_prefix - 1] = (QppEntryS) {
+.dev = entry->dev,
+.ino_prefix = entry->ino_prefix
+};
+ser->done++;
+}
+
+/*
+ * Tries to store the current qpp_table as extended attribute(s) on the
+ * exported file system root with the goal to preserve identical qids
+ * beyond the scope of reboots.
+ */
+static void v9fs_store_qpp_table(V9fsState *s)
+{
+FsContext *ctx = &s->ctx;
+V9fsString name;
+int i, res;
+size_t size;
+QppSrlzStream* stream;
+QppSerialize ser;
+
+if (is_ro_export(ctx))
+return;
+
+/*
+ * Whenever we exceeded some certain (arbitrary) high qpp_table size we
+ * delete the stored table from the file