Dom0: Use streaming decompression for ZSTD compressed kernels

Rafaël Kooi Wed, 10 May 2023 10:30:38 -0700

On 10/05/2023 11:48, Jan Beulich wrote:> First of all - please don't drop Cc-s 
when replying. I'm restoring

xen-devel@ here at least.


Apologies, I didn't notice that replying dropped the Cc-s. Should I send
the emails again with the proper Cc-s?

On 10.05.2023 10:51, Rafaël Kooi wrote:

On 10/05/2023 10:03, Jan Beulich wrote:> On 10.05.2023 02:18, Rafaël Kooi wrote:

On Arch Linux kernel decompression will fail when Xen has been unified
with the kernel and initramfs as a single binary. This change works for
both streaming and non-streaming ZSTD content.


This could do with better explaining what "unified" means, and how
streaming decompression actually makes a difference.


I don't mind explaining it further, but with the efi documentation for
it existing on xenbits, should I just refer to that?


You may of course refer to existing documentation. Iirc that doesn't
cover any compression aspects, though.


Right, I'll think about what ends up being the clearest explanation.

--- a/xen/common/decompress.c
+++ b/xen/common/decompress.c
@@ -3,11 +3,26 @@
   #include <xen/string.h>
   #include <xen/decompress.h>

+typedef struct _ZSTD_state

+{
+    void *write_buf;
+    unsigned int write_pos;
+} ZSTD_state;
+
   static void __init cf_check error(const char *msg)
   {
       printk("%s\n", msg);
   }

+static int __init cf_check ZSTD_flush(void *buf, unsigned int pos,

+                                      void *userptr)
+{
+    ZSTD_state *state = (ZSTD_state*)userptr;
+    memcpy(state->write_buf + state->write_pos, buf, pos);
+    state->write_pos += pos;
+    return pos;
+}


This doesn't really belong here, but will (I expect) go away anyway once
you drop the earlier patch.


The ZSTD_flush will have to stay, as that is how the decompressor will
start streaming decompression. The difference will be that the book
keeping will be "global" (to the translation unit).


But this bookkeeping should be entirely in zstd code (i.e. presumably
unzstd.c).


The implementation of the decompression functions seem to indicate
otherwise. Referring to unzstd.c:`unzstd`, the function will take the
streaming decompression path if either `fill` or `flush` have been
supplied. I cross checked with unlzma.c and unxz.c, and that seems to
have similar behavior in regards to flushing the output data. The
`flush` function is passed a buffer to a chunk of decompressed data with
`pos` being the size of the chunk. For the sake of consistency I don't
think it's a good idea to deviate from this behavior in just unzstd.c.

I could wrap the decompression code in another file and function, but
in my opinion it should stay here and be renamed to something generic
like `stream_flush` or `chunk_flush`.

       if ( len >= 4 && !memcmp(inbuf, "\x28\xb5\x2f\xfd", 4) )
-       return unzstd(inbuf, len, NULL, NULL, outbuf, NULL, error);
+    {
+        // NOTE (Rafaël): On Arch Linux the kernel is compressed in a way
+        // that requires streaming ZSTD decompression. Otherwise decompression
+        // will fail when using a unified EFI binary. Somehow decompression
+        // works when not using a unified EFI binary, I suspect this is the
+        // kernel self decompressing. Or there is a code path that I am not
+        // aware of that takes care of the use case properly.


Along the lines of what I've said for the description, this wants to avoid
terms like "somehow" if at all possible.


I've used the term "somehow" because I don't know why decompression
works when Xen loads the kernel from the EFI file system. I assume the
kernel still gets unpacked by Xen, right? Or does the kernel unpack
itself?


The handling of Dom0 kernel decompression ought to be entirely independent
of EFI vs legacy. Unless I'm wrong with that (mis-remembering), you
mentioning EFI is potentially misleading. And yes, at least on x86 the
kernel is decompressed by Xen (by peeking into the supplied bzImage). The
difference between a plain bzImage and a "unified EFI binary" is what you
will want to outline in the description (and at least mention in the
comment). What I'm wondering is whether there simply is an issue with size
determination when the kernel is taken from the .kernel section.


Assuming you are talking about size determination of the compressed
bzImage, I notice a discrepancy in the size of the ZSTD stream and the
reported size by the vmlinuz-* header upon further investigation. When
using the streaming decompression code I made it output how many bytes
it reads from the extracted-but-still-compressed bzImage. The code
reads 12,327,560 bytes, but the size of the compressed bzImage in the
header is 12,327,564 bytes. In xen/arch/x86/bzImage.c `decompress` is
called with `orig_image_len`, when the function `output_length`
calculates the end address and removes 4 bytes from that address. If I
remove the last 4 bytes from the compressed bzImage then `unzstd` will
unpack it with `unzstd bzImage.zst -o bzImage`, otherwise it will
complain with `zstd: /*stdin*\: unknown header`. With this new
information I think the correct solution is to try calling `decompress`
a second time with with `orig_image_len - 4` if it fails.

When I present the v2 of this patch, do I add you as a reviewer? Or will
that be done by the merger?


I'm afraid I don't understand the question. You will continue to Cc
respective maintainers, which will include me. In case you refer to a
Reviewed-by: tag - you can only add such tags once they were offered to
you by the respective person. For this specific one it doesn't mean "an
earlier version of this was looked at by <person>" but "this is deemed
okay by <person>".

Jan


I meant the "Reviewed-by:" tag indeed. Thanks again.

Rafaël

Re: [XEN PATCH 2/2] x86/Dom0: Use streaming decompression for ZSTD compressed kernels

Reply via email to