On Sat, Apr 17, 2021 at 2:56 PM Rob Landley <r...@landley.net> wrote:
> On 4/16/21 1:44 PM, Yi-yo Chiang wrote:
> > I'm not sure what Elliot's goal is? I assume he's trying to extract a
> > concatenated ramdisk, and I still see a problem in the current solution.
> >
> > The buffer-format
> > (https://www.kernel.org/doc/Documentation/early-userspace/buffer-format.txt)
> > says:
> >
> >   initramfs := ("\0" | cpio_archive | cpio_gzip_archive)*
> >
> > In other words, both `cat a.cpio b.cpio >merged.cpio` and
> > `(cat a.cpio && echo -n -e '\0\0\0' && cat b.cpio) >merged.cpio` are
> > valid initramfs.
>
> It also implies that two compressed files can be concatenated and separated by
> arbitrary runs of nulls, or you can have a compressed file and a non-compressed
> file concatenated, or...

Correct. Upon further inspection, it's actually "arbitrary NULs may precede
a GZIP(cpio_archive)", "arbitrary 4-aligned NULs may precede an
*uncompressed* cpio_archive" and "cpio_file/cpio_trailer within a
cpio_archive have to be 4-aligned with arbitrary NULs".
initramfs.c seems to try very hard to respect the alignment requirement,
but I guess we could just skip *ANY* extra NULs for simplicity?

> Grrr. I need to test this. And possibly genericize the tar.c code to detect
> compression type and run it through a decompressor so cpio can do it too...

Sounds like another can of worms... :/
The buffer-format.txt seems to be a bit outdated, as Linux now supports a
lot of compression types besides gzip, all of which are configurable
(https://elixir.bootlin.com/linux/latest/source/lib/decompress.c#L52).
So the initramfs grammar implemented by initramfs.c is in reality:

  initramfs := ("\0" | cpio_archive | compressed_cpio_archive)*
  compressed_cpio_archive := CONFIG_COMPRESSION_ALGORITHM(cpio_archive)
  CONFIG_COMPRESSION_ALGORITHM := GZIP | BZIP2 | LZMA | XZ | LZO | LZ4 | ZSTD

where the exact set of compression algorithms is decided by the kernel
config.
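
To make that grammar concrete, here is a minimal sketch in Python (not
toybox or kernel code) of classifying the next segment of an initramfs
buffer: skip any run of NUL bytes, then look at the magic. The names
`skip_nuls`, `classify` and `COMPRESSED_MAGIC` are made up for
illustration, and the two-byte magics are transcribed from my reading of
the table in lib/decompress.c, so please check them against the kernel
source before relying on them.

```python
# Two-byte magics, as I read them from the kernel's lib/decompress.c.
# Treat this table as an assumption, not as an authoritative copy.
COMPRESSED_MAGIC = {
    b"\x1f\x8b": "gzip",
    b"\x1f\x9e": "gzip",
    b"\x42\x5a": "bzip2",
    b"\x5d\x00": "lzma",
    b"\xfd\x37": "xz",
    b"\x89\x4c": "lzo",
    b"\x02\x21": "lz4",
    b"\x28\xb5": "zstd",
}

# ASCII magics of the "newc" cpio format (without and with checksum).
CPIO_MAGIC = (b"070701", b"070702")


def skip_nuls(buf, pos=0):
    """Return the offset of the first non-NUL byte at or after pos."""
    while pos < len(buf) and buf[pos] == 0:
        pos += 1
    return pos


def classify(buf, pos=0):
    """Skip leading NULs, then name the kind of segment that starts there.

    Returns (offset, kind) where kind is "end", "cpio", a compression
    name from COMPRESSED_MAGIC, or "unknown".
    """
    pos = skip_nuls(buf, pos)
    if pos >= len(buf):
        return pos, "end"
    if bytes(buf[pos:pos + 6]) in CPIO_MAGIC:
        return pos, "cpio"
    return pos, COMPRESSED_MAGIC.get(bytes(buf[pos:pos + 2]), "unknown")
```

A real extractor would loop: classify, hand the segment to the matching
decompressor or to the cpio parser, then continue from wherever that
consumer stopped. This sketch ignores the 4-byte alignment rule, i.e. it
takes the "just skip *ANY* extra NULs" approach.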
>
> > btw gen_init_cpio.c also pads initramfs to 512-byte boundary
> > (https://github.com/torvalds/linux/blob/6fbd6cf85a3be127454a1ad58525a3adcf8612ab/usr/gen_init_cpio.c#L97)
>
> *blink* *blink* Why...? (cpio doesn't have a 512 stride in the file format? It
> has a 4-byte stride for padding strings with NUL bytes, but that's about it?)
>
> > If we're viewing buffer-format.txt as the "right" cpio spec, then I think we
> > should implement this too. We should skip arbitrary extra NUL-bytes padded
> > between cpio file frames
>
> Skipping arbitrary extra null bytes at the start is easy enough to do. I guess
> the hardwired trailing read was expecting the 512 padding...
>
> I'm gonna need to add a _lot_ more test suite entries for this command.
>
> Ok, skip arbitrary leading NUL bytes after each entry, pad last record to 512
> byte alignment with NUL bytes, autodetect compression type at each record start,
> implement hardlinks and have TRAILER!!! flush hardlink context...

I'm not so sure about padding the last entry to a 512-byte boundary. 512
looks like a random value to me? (Or an implementation detail of GNU cpio
and gen_init_cpio.)
Nonetheless I think we should pad the last record to a 4-byte boundary, so
that the output of both

  cat a.cpio.gz b.cpio.gz >c.cpio.gz

and

  zcat a.cpio.gz b.cpio.gz >c.cpio

is a valid initramfs/cpio?

> Rob

--
Yi-yo Chiang
Software Engineer
yochi...@google.com
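
For the padding question, a rough sketch (Python, illustrative only, not
what toybox or gen_init_cpio actually do) of a newc writer that pads the
name and the data of every entry, including TRAILER!!!, to a 4-byte
boundary. `padding`, `newc_entry` and `newc_archive` are hypothetical
helper names, and uid/gid/mtime/device fields are zeroed for brevity. The
newc header is 110 bytes (6 magic + 13 fields of 8 hex digits), so the
trailer entry comes out at 124 bytes and concatenated archives stay
4-aligned without any 512-byte padding.

```python
def padding(n, align=4):
    """Number of NUL bytes needed to round n up to a multiple of align."""
    return -n % align


def newc_entry(name, data=b"", mode=0o100644, ino=0, nlink=1):
    """Build one newc cpio entry; name and data are each padded to 4 bytes."""
    name_b = name.encode() + b"\0"  # namesize counts the trailing NUL
    # Field order: ino, mode, uid, gid, nlink, mtime, filesize,
    # devmajor, devminor, rdevmajor, rdevminor, namesize, check
    fields = (ino, mode, 0, 0, nlink, 0, len(data), 0, 0, 0, 0, len(name_b), 0)
    out = b"070701" + b"".join(b"%08X" % f for f in fields) + name_b
    out += b"\0" * padding(len(out))          # align end of header + name
    out += data + b"\0" * padding(len(data))  # align end of file data
    return out


def newc_archive(files):
    """Concatenate entries for (name, data) pairs and append the trailer."""
    out = b"".join(newc_entry(n, d, ino=i + 1) for i, (n, d) in enumerate(files))
    return out + newc_entry("TRAILER!!!", mode=0)


a = newc_archive([("a", b"hello")])
b = newc_archive([("b", b"xy")])
merged = a + b  # the Python equivalent of `cat a.cpio b.cpio`
```

Here `len(a) % 4 == 0`, so the second archive's "070701" magic starts on a
4-byte boundary in `merged`. If a 512-byte-padded image were wanted
anyway, `padding(len(a), 512)` gives the number of NULs to append.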
_______________________________________________
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net