On Sat, 31 Aug 2019 at 15:31, Christian Lamparter <chunk...@gmail.com> wrote: > > Hello, > > On Saturday, August 31, 2019 2:09:55 PM CEST Jonas Gorski wrote: > > On Sat, 31 Aug 2019 at 01:19, Christian Lamparter <chunk...@gmail.com> > > wrote: > > > > > > On Friday, August 30, 2019 11:10:54 PM CEST Russell Senior wrote: > > > > >>>>> "Christian" == Christian Lamparter <chunk...@gmail.com> writes: > > > > > > > > Christian> Ok. > > > > > > > > Christian> I did push a patch titled: "build: remove harmful -nopad > > > > Christian> option from mksquashfs" > > > > Christian> > > > > <https://git.openwrt.org/?p=openwrt/openwrt.git;a=commit;h=1c0290c5cc6258c48b8ba46b4f9c85a21de4f875> > > > > > > > > Christian> so, let's see if this triggers more undefined behaviour and > > > > Christian> exposes more hidden broken code. > > > > > > > > Just to re-iterate my earlier worry, if for example: > > > > > > > > kernel_size + rootfs_with_padding_size > > > > > > > > crosses an erase block boundary that: > > > > > > > > kernel_size + rootfs_without_padding_size > > > > > > > > does not, then we'll be wasting an erase block of flash space on NOR. > > > Not to worry. I guess that part of the magic is also explained in the > > > wiki: > > > <https://openwrt.org/docs/techref/flash.layout#example_flash_partitioning> > > > <https://openwrt.org/docs/techref/flash.layout#explanations> > > > In case my attempt now gets too confusing :-/. > > > > > > The "rootfs" partition also contains the rootfs_data inside. It should > > > be possible to verify this with any of the routers that use the "firmware" > > > parition label. > > > > > > Please check /proc/mtd. For example on my WNDR3700v2 (which is very > > > similar > > > to your WNDR3800). It looks like this > > > > > > |# cat /proc/mtd > > > |dev: size erasesize name > > > |mtd0: 00050000 00010000 "u-boot" > > > |mtd1: 00020000 00010000 "u-boot-env" > > > |mtd2: 00f80000 00010000 "firmware" > > > |mtd3: 0018c440 00010000 "kernel" > > > |mtd4: 00df3bc0 00010000 "rootfs" > > > |mtd5: 00620000 00010000 "rootfs_data" > > > |mtd6: 00010000 00010000 "art > > > > > > The rootfs is 14629824 Bytes = 13.95 MiB. (The kernel + uboot + env + art > > > fills out the remaining ~2MiB to a total of 16 MiB). So the padding we > > > both > > > are talking about already has to exists between the squashfs portion and > > > the > > > jffs2/overlay portion inside the "rootfs" partition and it's a full > > > erase-size > > > block. Sadly, OpenWrt does not readily print the "end" of just the > > > squashfs > > > partition (as it does with the kernel partition above) and your message > > > from > > > earlier with the three routers didn't include the "squashfs-split" and its > > > results. (I think that maybe this is the missing information that got > > > lost?) > > > > After a bit more investigation, those devices that use padjffs2 (or > > have the rootfs > > start at an at least 4k aligned offset) will be fine. > > > > The issue are those devices with an unaligned rootfs start, which put their > > own > > EOF marker in the image by the firmware util (e.g. brcm63xx or tp-link > > devices). > > > > There it can happen that e.g. we get > > > > 0x18fff2 <squashfs end> 00 00 00 00 > > 0x19000 00 00 00 00 ... <end of padding> FF FF > > 0x19010 FF FF FF ... > > * > > 0x20000 DE AD C0 DE > > > > The 0xff are only a guess, I haven't checked what the different > > firmware utils use for padding the end of the rootfs - but mksquashfs > > definitely uses 0x00. > > > > which leads to: > > > > 1. squashfs-split will put the roofs_data partition start at 0x19000 > > because that's the first aligned erase block after the end of the > > rootfs start + squashfs length > > 2. jffs2 will see that the first block is neither a valid jff2 node, > > an EOF marker, nor all 0xff > > 3. jffs2 will refuse to erase/mount the filesystem (AFAIU the code) [1] > > > > So removing the -nopad might actually break those devices. > > > > We could fix this case by making sure that mksquashfs and all firmware > > utils use 0xff's to pad (so the erase block will then be treated as > > empty/all 0xff). But then there is the question how jffs2 reacts if > > the first block is 0xff, and the second block is a valid jffs2 node, > > which happens when we sysupgrade with keeping config on NOR devices. > > The jffs2 code isn't the most readable code ... . > > > No need to worry, see one of the previous mails in this thread: > > http://lists.infradead.org/pipermail/openwrt-devel/2019-August/018638.html > > It contains a patch at the end titled: > "[PATCH] base-files: pad root.squashfs to 64KiB in ubi volumes" > This is another approach that just deals with the UBI+squashfs > issue but works with "-nopad". Soooooo.... do we all agree there?
a) 64k is excessive, we only need 4k (actually 1k would be enough, since we don't enable CONFIG_SQUASHFS_4K_DEVBLK_SIZE). The referenced issue with 64k page size happens when loop-mounting a squashfs, since loop defaults to PAGE_SIZE as its block size. But we never do that in OpenWrt, and we don't support any targets with that huge PAGE_SIZE - biggest is ARC with 8k. b) it misses the squashfs's in generic sysupgrade images itself - we need to pad their length as well, to avoid breaking devices with a sysupgrade image hitting the corner case being flashed from an unfixed firmware with the old nand.sh. Also IMHO "1c0290c5cc6258c48b8ba46b4f9c85a21de4f875" should be reverted, for the previously mentioned issues. Regards Jonas _______________________________________________ openwrt-devel mailing list openwrt-devel@lists.openwrt.org https://lists.openwrt.org/mailman/listinfo/openwrt-devel