Re: kexec failures with DEBUG_RODATA
On Wed, Jun 15, 2016 at 11:42:28PM +0100, Russell King - ARM Linux wrote: > The boot environment must be setup such that there is room for the > uncompressed image (aligned currently to 256 bytes) followed by the > size of the compressed image, with any appended DTBs included. > Anything which is located below that is likely to get trampled by > the decompressor. There's a question that's been lingering, which hasn't been brought up by anyone yet: should we even support zImage files with an appended DTB for kexec? When kexec is operating in DTB mode, it will provide the DTB to the kernel from the host environment, possibly with some modifications (eg, to change the command line arguments.) Since an appended DTB would override the kexec-provided DTB, this would wipe out those changes, some of which are necessary for things like crashdump to work. So, I'm thinking that kexec should not support a zImage with appended DTB - in fact, I think it should truncate the zImage to be loaded, and ensure that the word following the zImage does not contain a DTB magic number, in case the zImage has been built with CONFIG_ARM_APPENDED_DTB enabled. Thoughts? -- RMK's Patch system: http://www.armlinux.org.uk/developer/patches/ FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up according to speedtest.net. ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: kexec failures with DEBUG_RODATA
On Wed, Jun 15, 2016 at 10:13:59PM +0100, Russell King - ARM Linux wrote: > On Tue, Jun 14, 2016 at 11:05:23AM -0700, Kees Cook wrote: > > I'm much less familiar with the ARM decompression stub, but is there a > > boot image header (like x86 has)? If not, perhaps we can invent one, > > and it can carry all the details needed for a bootloader to do the > > right things. > > With a bit of tinkering around, I now have this: > > <.data>: >0: e1a0nop ; (mov r0, r0) >4: e1a0nop ; (mov r0, r0) >8: e1a0nop ; (mov r0, r0) >c: e1a0nop ; (mov r0, r0) > 10: e1a0nop ; (mov r0, r0) > 14: e1a0nop ; (mov r0, r0) > 18: e1a0nop ; (mov r0, r0) > 1c: e1a0nop ; (mov r0, r0) > 20: ea0fb 0x64 > > Then follows the existing "header" which we've had there for years: > > 24: 016f2818 ; LE magic number > 28: ; LE zImage start address (always zero now) > 2c: 00431fe0 ; LE zImage _edata > 30: 04030201 ; endian flag > > And now comes the new header: > > 34: 016f2818 ; LE magic number > 38: 0001 ; LE version number (v1) > 3c: 01287000 ; LE total space required for decompressor > 40: 00e54000 ; LE uncompressed image size > > Up to 64 bytes available here for future expansion, currently filled > with zeros. > ... > > Remainder of the zImage code: > 64: e10f9000mrs r9, CPSR Looking at this again, this can't work for another _two_ reasons: 1. Thumb2 kernels - we rely on a relative address to jump to for the remainder of the zImage code, using an "adr" instruction. The offset needs to be known at assembly time, but with my approach of moving the header into the linker script, this is no longer the case. 2. EFI images need the header offset at 0x3c, which gets in the way of us appending to our existing header. So... I don't think we can solve it this way, and I'm all out of ideas how to solve this in a sane manner - I'm currently of the opinion that it _isn't_ solvable given where we are without defining a new format for a zImage, which is really quite depressing. I think people are just going to have to get used to using --image-size to kexec on ARM to work around this problem. So, let's document it in kexec's --help - though this could do with something better, but I fear it'll turn into a multi-line description of the option: kexec/arch/arm/kexec-zImage-arm.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/kexec/arch/arm/kexec-zImage-arm.c b/kexec/arch/arm/kexec-zImage-arm.c index 9400d1f4..45c587c2 100644 --- a/kexec/arch/arm/kexec-zImage-arm.c +++ b/kexec/arch/arm/kexec-zImage-arm.c @@ -122,6 +122,8 @@ void zImage_arm_usage(void) " --initrd=FILE Use FILE as the kernel's initial ramdisk.\n" " --ramdisk=FILEUse FILE as the kernel's initial ramdisk.\n" " --dtb=FILEUse FILE as the fdt blob.\n" + " --image-size=IMAGE_SIZE\n" + " Reserve IMAGE_SIZE between kernel and initrd.\n" " --atags Use ATAGs instead of device-tree.\n" " --page-offset=PAGE_OFFSET\n" " Set PAGE_OFFSET of crash dump vmcore\n" -- RMK's Patch system: http://www.armlinux.org.uk/developer/patches/ FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up according to speedtest.net. ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: kexec failures with DEBUG_RODATA
On Tue, Jun 21, 2016 at 05:18:49PM +0530, Pratyush Anand wrote: > On 16/06/2016:12:13:03 AM, Russell King - ARM Linux wrote: > > I should point out that this method should work for a zImage without > > an appended DTB, and we have no way to update this header block for > > the appended DTB case. > > Well, I might be missing with my limited knowledge. So, just to clarify, > zImage > is "compressed image + uncompressor". In some cases it might have "+DTB". > So, zImage should be located atleast "aligned uncompressed size" above "kernel > base" and, initrd should be placed at "kernel base" + "aligned uncompressed > size" + "compressed image size" + "uncompressor size" + DTB(if any). > > However, I am unable to understand that why can't we have a flag in zImage > header block, which tells that whether a DTB has been appended in the zImage > or > not? cat zImage foo.dtb > newImage obviously doesn't result in the header block inside zImage being modified. The only way around that is to have a tool which concatenates the two and updates the header. At that point, it's no longer simply a concatenation. > > So, an alternative standpoint is that we supply only the uncompressed > > image size. Then, the boot environment needs to understand that they > > must allow for the compressed image and any appended DTB on top of that > > (which it would see as one - the size of the combined image.) > > So, why not we can have all three information in header ie. "uncompressed > image > size", "compressed image size" and "DTB size" if appended flag is set. How big is the DTB size? Without a tool to update the header, there's no way to really know. > > However, while that may sound like a good idea, we're falling into the > > same trap that we've fallen into at the beginning of this thread: the > > boot environment has to understand how the decompressor currently works, > > and if we were to change that, this we're back to the calculation which > > the boot environment is using not matching reality. > > Currently we are discussing between 4 times to 11 times. So, if we know > *atleast* "uncompressed image size" then this heuristic variation can be > minimized a lot. If we do not provide header, then may be we can provide > library > to the kexec-tools which will enable us to know the length of "uncompressed > image" when "compressed image" is provided as input. If we provide the uncompressed size, then we know how big the uncompressed image will be. That much is fine, but there's more than that. What is the padding applied to the uncompressed size? Where does the compressed image self-relocate to? Where does the compressed image relocate the DTB to? What is the spacing between the DTB and the compressed image? All those are knowledge which is internal to the decompressor code. If we supply, say, the padded uncompressed size, then we can work around not knowing the padding in kexec. However, we still need to know the other details - so we still end up needing to code into kexec (or other boot environment) a load of details specific to the current decompressor implementation. -- RMK's Patch system: http://www.armlinux.org.uk/developer/patches/ FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up according to speedtest.net. ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: kexec failures with DEBUG_RODATA
Hi Russell, On 16/06/2016:12:13:03 AM, Russell King - ARM Linux wrote: > On Wed, Jun 15, 2016 at 03:54:38PM -0700, Kees Cook wrote: > > On Wed, Jun 15, 2016 at 3:42 PM, Russell King - ARM Linux > >wrote: > > > In fact, the apparent confusion over this reinforces my belief that we > > > should _not_ give the size of the uncompressed image at all. > > > > > > The boot environment must be setup such that there is room for the > > > uncompressed image (aligned currently to 256 bytes) followed by the > > > size of the compressed image, with any appended DTBs included. > > > Anything which is located below that is likely to get trampled by > > > the decompressor. > > > > Okay, sounds reasonable to me. :) > > I should point out that this method should work for a zImage without > an appended DTB, and we have no way to update this header block for > the appended DTB case. Well, I might be missing with my limited knowledge. So, just to clarify, zImage is "compressed image + uncompressor". In some cases it might have "+DTB". So, zImage should be located atleast "aligned uncompressed size" above "kernel base" and, initrd should be placed at "kernel base" + "aligned uncompressed size" + "compressed image size" + "uncompressor size" + DTB(if any). However, I am unable to understand that why can't we have a flag in zImage header block, which tells that whether a DTB has been appended in the zImage or not? > > So, an alternative standpoint is that we supply only the uncompressed > image size. Then, the boot environment needs to understand that they > must allow for the compressed image and any appended DTB on top of that > (which it would see as one - the size of the combined image.) So, why not we can have all three information in header ie. "uncompressed image size", "compressed image size" and "DTB size" if appended flag is set. > > However, while that may sound like a good idea, we're falling into the > same trap that we've fallen into at the beginning of this thread: the > boot environment has to understand how the decompressor currently works, > and if we were to change that, this we're back to the calculation which > the boot environment is using not matching reality. Currently we are discussing between 4 times to 11 times. So, if we know *atleast* "uncompressed image size" then this heuristic variation can be minimized a lot. If we do not provide header, then may be we can provide library to the kexec-tools which will enable us to know the length of "uncompressed image" when "compressed image" is provided as input. ~Pratyush > > -- > RMK's Patch system: http://www.armlinux.org.uk/developer/patches/ > FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up > according to speedtest.net. ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: kexec failures with DEBUG_RODATA
On Wed, Jun 15, 2016 at 03:54:38PM -0700, Kees Cook wrote: > On Wed, Jun 15, 2016 at 3:42 PM, Russell King - ARM Linux >wrote: > > In fact, the apparent confusion over this reinforces my belief that we > > should _not_ give the size of the uncompressed image at all. > > > > The boot environment must be setup such that there is room for the > > uncompressed image (aligned currently to 256 bytes) followed by the > > size of the compressed image, with any appended DTBs included. > > Anything which is located below that is likely to get trampled by > > the decompressor. > > Okay, sounds reasonable to me. :) I should point out that this method should work for a zImage without an appended DTB, and we have no way to update this header block for the appended DTB case. So, an alternative standpoint is that we supply only the uncompressed image size. Then, the boot environment needs to understand that they must allow for the compressed image and any appended DTB on top of that (which it would see as one - the size of the combined image.) However, while that may sound like a good idea, we're falling into the same trap that we've fallen into at the beginning of this thread: the boot environment has to understand how the decompressor currently works, and if we were to change that, this we're back to the calculation which the boot environment is using not matching reality. -- RMK's Patch system: http://www.armlinux.org.uk/developer/patches/ FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up according to speedtest.net. ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: kexec failures with DEBUG_RODATA
On Wed, Jun 15, 2016 at 3:42 PM, Russell King - ARM Linuxwrote: > In fact, the apparent confusion over this reinforces my belief that we > should _not_ give the size of the uncompressed image at all. > > The boot environment must be setup such that there is room for the > uncompressed image (aligned currently to 256 bytes) followed by the > size of the compressed image, with any appended DTBs included. > Anything which is located below that is likely to get trampled by > the decompressor. Okay, sounds reasonable to me. :) -Kees -- Kees Cook Chrome OS & Brillo Security ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: kexec failures with DEBUG_RODATA
On Wed, Jun 15, 2016 at 2:13 PM, Russell King - ARM Linuxwrote: > On Tue, Jun 14, 2016 at 11:05:23AM -0700, Kees Cook wrote: >> I'm much less familiar with the ARM decompression stub, but is there a >> boot image header (like x86 has)? If not, perhaps we can invent one, >> and it can carry all the details needed for a bootloader to do the >> right things. > > With a bit of tinkering around, I now have this: > > <.data>: >0: e1a0nop ; (mov r0, r0) >4: e1a0nop ; (mov r0, r0) >8: e1a0nop ; (mov r0, r0) >c: e1a0nop ; (mov r0, r0) > 10: e1a0nop ; (mov r0, r0) > 14: e1a0nop ; (mov r0, r0) > 18: e1a0nop ; (mov r0, r0) > 1c: e1a0nop ; (mov r0, r0) > 20: ea0fb 0x64 > > Then follows the existing "header" which we've had there for years: > > 24: 016f2818; LE magic number > 28: ; LE zImage start address (always zero now) > 2c: 00431fe0; LE zImage _edata > 30: 04030201; endian flag > > And now comes the new header: > > 34: 016f2818; LE magic number Should this be a different magic from the existing header's magic? > 38: 0001; LE version number (v1) Should a "size" follow the version number instead of the explicit 64 bytes of zeros? > 3c: 01287000; LE total space required for decompressor > 40: 00e54000; LE uncompressed image size > > Up to 64 bytes available here for future expansion, currently filled > with zeros. > ... > > Remainder of the zImage code: > 64: e10f9000mrs r9, CPSR > > I'm rather on the fence whether we need to give the uncompressed image > size - the important thing is the size of memory that's required for > the decompressor to run, which is sizeof(uncompressed kernel) rounded > up to 256 bytes, and the relocated decompressor image size. I think it's important information since it allows a boot loader to figure out if there's room in a given range for the result. While there's no relocation support yet, if we gain it on ARM and we want to support KASLR, it will be very handy to have the uncompressed size available. > > The "total space required for decompressor" is slightly cheating at the > figure - I'm including the uncompressed image rounded up and the entire > compressed image in that size, so it's a safe over-estimate. > > I'm not sure there's a need to provide the uncompressed image size, the > boot environment shouldn't have a reason to know that, so I'm tempted to > omit it. > > We could dispense with the endian conversions, and push the responsibility > for interpreting that onto the reader of this data: we have the endian > flag in the existing header block, so the boot environment can work out > the endianness of the image and apply fixups as appropriate. > > Why generate this in the linker script? We need the size of the zImage > here, which is only known to the linker. > > diff --git a/arch/arm/boot/compressed/Makefile > b/arch/arm/boot/compressed/Makefile > index d50430c40045..1d5467e05250 100644 > --- a/arch/arm/boot/compressed/Makefile > +++ b/arch/arm/boot/compressed/Makefile > @@ -119,6 +119,10 @@ asflags-y := -DZIMAGE > KBSS_SZ = $(shell $(CROSS_COMPILE)size $(obj)/../../../../vmlinux | \ > awk 'END{print $$3}') > LDFLAGS_vmlinux = --defsym _kernel_bss_size=$(KBSS_SZ) > + > +KERNEL_IMAGE_SIZE = $(shell stat -c '%s' $(obj)/../Image) > +LDFLAGS_vmlinux += --defsym _kernel_image_size=$(KERNEL_IMAGE_SIZE) > + > # Supply ZRELADDR to the decompressor via a linker symbol. > ifneq ($(CONFIG_AUTO_ZRELADDR),y) > LDFLAGS_vmlinux += --defsym zreladdr=$(ZRELADDR) > diff --git a/arch/arm/boot/compressed/head.S b/arch/arm/boot/compressed/head.S > index e2e0dcb42ca2..395c60dcc4f7 100644 > --- a/arch/arm/boot/compressed/head.S > +++ b/arch/arm/boot/compressed/head.S > @@ -131,11 +131,7 @@ start: > THUMB(badrr12, 1f ) > THUMB(bx r12 ) > > - .word _magic_sig @ Magic numbers to help the loader > - .word _magic_start@ absolute load/run zImage address > - .word _magic_end @ zImage end address > - .word 0x04030201 @ endianness flag > - > + .section ".start2", #alloc, #execinstr > THUMB(.thumb ) > 1: __EFI_HEADER > > diff --git a/arch/arm/boot/compressed/vmlinux.lds.S > b/arch/arm/boot/compressed/vmlinux.lds.S > index 81c493156ce8..77267724ec8a 100644 >
Re: kexec failures with DEBUG_RODATA
On Tue, Jun 14, 2016 at 11:05:23AM -0700, Kees Cook wrote: > I'm much less familiar with the ARM decompression stub, but is there a > boot image header (like x86 has)? If not, perhaps we can invent one, > and it can carry all the details needed for a bootloader to do the > right things. With a bit of tinkering around, I now have this: <.data>: 0: e1a0nop ; (mov r0, r0) 4: e1a0nop ; (mov r0, r0) 8: e1a0nop ; (mov r0, r0) c: e1a0nop ; (mov r0, r0) 10: e1a0nop ; (mov r0, r0) 14: e1a0nop ; (mov r0, r0) 18: e1a0nop ; (mov r0, r0) 1c: e1a0nop ; (mov r0, r0) 20: ea0fb 0x64 Then follows the existing "header" which we've had there for years: 24: 016f2818; LE magic number 28: ; LE zImage start address (always zero now) 2c: 00431fe0; LE zImage _edata 30: 04030201; endian flag And now comes the new header: 34: 016f2818; LE magic number 38: 0001; LE version number (v1) 3c: 01287000; LE total space required for decompressor 40: 00e54000; LE uncompressed image size Up to 64 bytes available here for future expansion, currently filled with zeros. ... Remainder of the zImage code: 64: e10f9000mrs r9, CPSR I'm rather on the fence whether we need to give the uncompressed image size - the important thing is the size of memory that's required for the decompressor to run, which is sizeof(uncompressed kernel) rounded up to 256 bytes, and the relocated decompressor image size. The "total space required for decompressor" is slightly cheating at the figure - I'm including the uncompressed image rounded up and the entire compressed image in that size, so it's a safe over-estimate. I'm not sure there's a need to provide the uncompressed image size, the boot environment shouldn't have a reason to know that, so I'm tempted to omit it. We could dispense with the endian conversions, and push the responsibility for interpreting that onto the reader of this data: we have the endian flag in the existing header block, so the boot environment can work out the endianness of the image and apply fixups as appropriate. Why generate this in the linker script? We need the size of the zImage here, which is only known to the linker. diff --git a/arch/arm/boot/compressed/Makefile b/arch/arm/boot/compressed/Makefile index d50430c40045..1d5467e05250 100644 --- a/arch/arm/boot/compressed/Makefile +++ b/arch/arm/boot/compressed/Makefile @@ -119,6 +119,10 @@ asflags-y := -DZIMAGE KBSS_SZ = $(shell $(CROSS_COMPILE)size $(obj)/../../../../vmlinux | \ awk 'END{print $$3}') LDFLAGS_vmlinux = --defsym _kernel_bss_size=$(KBSS_SZ) + +KERNEL_IMAGE_SIZE = $(shell stat -c '%s' $(obj)/../Image) +LDFLAGS_vmlinux += --defsym _kernel_image_size=$(KERNEL_IMAGE_SIZE) + # Supply ZRELADDR to the decompressor via a linker symbol. ifneq ($(CONFIG_AUTO_ZRELADDR),y) LDFLAGS_vmlinux += --defsym zreladdr=$(ZRELADDR) diff --git a/arch/arm/boot/compressed/head.S b/arch/arm/boot/compressed/head.S index e2e0dcb42ca2..395c60dcc4f7 100644 --- a/arch/arm/boot/compressed/head.S +++ b/arch/arm/boot/compressed/head.S @@ -131,11 +131,7 @@ start: THUMB(badrr12, 1f ) THUMB(bx r12 ) - .word _magic_sig @ Magic numbers to help the loader - .word _magic_start@ absolute load/run zImage address - .word _magic_end @ zImage end address - .word 0x04030201 @ endianness flag - + .section ".start2", #alloc, #execinstr THUMB(.thumb ) 1: __EFI_HEADER diff --git a/arch/arm/boot/compressed/vmlinux.lds.S b/arch/arm/boot/compressed/vmlinux.lds.S index 81c493156ce8..77267724ec8a 100644 --- a/arch/arm/boot/compressed/vmlinux.lds.S +++ b/arch/arm/boot/compressed/vmlinux.lds.S @@ -37,6 +37,19 @@ SECTIONS .text : { _start = .; *(.start) +_header = .; +LONG(ZIMAGE_MAGIC(0x016f2818));/* Magic numbers to help the loader */ +LONG(ZIMAGE_MAGIC(_start));/* absolute load/run zImage address */ +LONG(ZIMAGE_MAGIC(_edata));/* zImage end address */ +LONG(0x04030201); /* Endianness flag */ +LONG(ZIMAGE_MAGIC(0x016f2818));/* Further header indicator */ +LONG(ZIMAGE_MAGIC(1)); /* Version 1 */ +LONG(ZIMAGE_MAGIC(((_kernel_image_size + 255) & ~ 255) + \ + _edata
Re: kexec failures with DEBUG_RODATA
On Wed, Jun 15, 2016 at 01:25:08PM +0530, Pratyush Anand wrote: > Sure, having a header information would be handy to do it. Other alternative > could be that we define "HAVE_LIBZ" and then we can have something like > kexec-Image-arm.c which handles plane Image. We can also have something like > get_zlib_decompressed_length() which can give us exact length we need for > kernel > and then we can place initrd accordingly in zImage_arm_load(). I really don't want to do that. There are things that the decompressor does which make it easier to deal with the zImage than the Image. > I see at least another issue clearly in ARM kernel code with > CONFIG_DEBUG_RODATA > enabled. When CONFIG_DEBUG_RODATA is enabled, we can not write text area. > kexec_start_address has been defined in relocate_kernel.S as a text area. > machine_kexec() writes at kexec_start_address with image->start. Similarly > there > would be issues for overwriting of kexec_indirection_page, kexec_mach_type and > kexec_boot_atags. If arm mmu mapping configures text pages as RO, then we > should > see an abort as soon as we do these writes. That's not a problem, because set_kernel_text_rw() is called immediately prior to writing, which has the effect of allowing these writes. -- RMK's Patch system: http://www.armlinux.org.uk/developer/patches/ FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up according to speedtest.net. ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: kexec failures with DEBUG_RODATA
On 06/14/16 at 11:05am, Kees Cook wrote: > On Tue, Jun 14, 2016 at 10:59 AM, Russell King - ARM Linux >wrote: > > Since the kernel now has CONFIG_DEBUG_RODATA by default, this means > > that these kinds of ratio-based assumptions are even more invalid > > than they have been. > > > > Right now, a zImage doesn't advertise the size of its uncompressed > > image, but I think with things like CONFIG_DEBUG_RODATA, we can no > > longer make assumptions like we have done in the past, and we need > > the zImage to provide this information so that the boot environment > > can be setup sanely by boot loaders/kexec rather than relying on > > broken heuristics like this. > > > > Thoughts? > > I'm much less familiar with the ARM decompression stub, but is there a > boot image header (like x86 has)? If not, perhaps we can invent one, > and it can carry all the details needed for a bootloader to do the > right things. Yes, x86 stores addr and size of initrd into boot header. When decompressing kernel it will choose a safe starting position before the loaded place according to the max evaluation of decompressing algorithm. ARM only use a rough 4 times evaluation, sounds too hasty. Simplest way is to increase times to 8 for now. The final way should be as Kees suggested. Thanks Baoquan ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: kexec failures with DEBUG_RODATA
On Tue, Jun 14, 2016 at 10:59 AM, Russell King - ARM Linuxwrote: > Guys, > > Having added Keystone2 support to kexec, and asking TI to validate > linux-next with mainline kexec-tools, I received two reports from > them. > > The first was a report of success, but was kexecing a 4.4 kernel > from linux-next. > > The second was a failure report, kexecing current linux-next from > linux-next on this platform. However, my local tests (using my > 4.7-rc3 derived kernel) showed there to be no problem. > > Building my 4.7-rc3 derived kernel with TI's configuration they > were using with linux-next similarly failed. So, it came down to > a configuration difference. > > After trialling several configurations, it turns out that the > failure is, in part, caused by CONFIG_DEBUG_RODATA being enabled > on TI's kernel but not mine. Why should this make any difference? > > Well, CONFIG_DEBUG_RODATA has the side effect that the kernel > contains a lot of additional padding - we pad out to section size > (1MB) the ELF sections with differing attributes. This should not > normally be a problem, except kexec contains this assumption: > > /* Otherwise, assume the maximum kernel compression ratio > * is 4, and just to be safe, place ramdisk after that */ > initrd_base = base + _ALIGN(len * 4, getpagesize()); > > Now, first things first. Don't get misled by the comment - it's > totally false. That may be what's desired, but that is far from > what actually happens in reality. > > "base" is _not_ the address of the start of the kernel image, but > is the base address of the start of the region that the kernel is > to be loaded into - remember that the kernel is normally loaded > 32k higher than the start of memory. This 32k offset is _not_ > included in either "base" nor "len". So, even if we did want to > assume that there was a maximum compression ratio of 4, the above > always calculates 32k short of that value. > > The other invalid thing here is this whole "maximum kernel compression > ratio" assumption. Consider this non-DEBUG_RODATA kernel image: > >textdata bss dec hex filename > 6583513 2273816 215344 9072673 8a7021 ../build/ks2/vmlinux > > This results in an image and zimage of: > -rwxrwxr-x 1 rmk rmk 8871936 Jun 14 18:02 ../build/ks2/arch/arm/boot/Image > -rwxrwxr-x 1 rmk rmk 4381592 Jun 14 18:02 ../build/ks2/arch/arm/boot/zImage > > which is a ratio of about a 49%. On entry to the decompressor, the > compressed image will be relocated above the expected resulting > kernel size. So, let's say that it's relocated to 9MB. This means > the zImage will occupy around 9MB-14MB above the start of memory. > Going by the 4x ratio, we place the other images at 16.7MB. This > leaves around 2.7MB free. So that's probably fine... but think > about this. We assumed a ratio of 4x, but really we're in a rather > tight squeeze - we actually have only about 50% of the compressed > image size spare. > > Now let's look at the DEBUG_RODATA case: > >textdata bss dec hex filename > 6585305 2273952 215344 9074601 8a77a9 ../build/ks2/vmlinux > > And the resulting sizes: > -rwxrwxr-x 1 rmk rmk 15024128 Jun 14 18:49 ../build/ks2/arch/arm/boot/Image > -rwxrwxr-x 1 rmk rmk 4399040 Jun 14 18:49 ../build/ks2/arch/arm/boot/zImage > > That's a compression ratio of about 29%. Still within the 4x limit, > but going through the same calculation above shows that we end up > totally overflowing the available space this time. > > That's exactly the same kernel configuration except for > CONFIG_DEBUG_RODATA - enabling this has almost _doubled_ the > decompressed image size without affecting the compressed size. > > We've known for some time that this ratio of 4x doesn't work - we > used to use the same assumption in the decompressor when self- > relocating, and we found that there are images which achieve a > better compression ratio and make this invalid. Yet, the 4x thing > has persisted in kexec code... and buggily too. > > Since the kernel now has CONFIG_DEBUG_RODATA by default, this means > that these kinds of ratio-based assumptions are even more invalid > than they have been. > > Right now, a zImage doesn't advertise the size of its uncompressed > image, but I think with things like CONFIG_DEBUG_RODATA, we can no > longer make assumptions like we have done in the past, and we need > the zImage to provide this information so that the boot environment > can be setup sanely by boot loaders/kexec rather than relying on > broken heuristics like this. > > Thoughts? I'm much less familiar with the ARM decompression stub, but is there a boot image header (like x86 has)? If not, perhaps we can invent one, and it can carry all the details needed for a bootloader to do the right things. -Kees -- Kees Cook Chrome OS & Brillo Security ___ kexec mailing list