Re: [PATCH] efi: libstub/arm: account for firmware reserved memory at the base of RAM

2019-10-14 Thread Chester Lin
On Mon, Oct 14, 2019 at 06:33:09PM +0200, Ard Biesheuvel wrote:
> The EFI stubloader for ARM starts out by allocating a 32 MB window
> at the base of RAM, in order to ensure that the decompressor (which
> blindly copies the uncompressed kernel into that window) does not
> overwrite other allocations that are made while running in the context
> of the EFI firmware.
> 
> In some cases, (e.g., U-Boot running on the Raspberry Pi 2), this is
> causing boot failures because this initial allocation conflicts with
> a page of reserved memory at the base of RAM that contains the SMP spin
> tables and other pieces of firmware data and which was put there by
> the bootloader under the assumption that the TEXT_OFFSET window right
> below the kernel is only used partially during early boot, and will be
> left alone once the memory reservations are processed and taken into
> account.
> 
> So let's permit reserved memory regions to exist in the region starting
> at the base of RAM, and ending at TEXT_OFFSET - 5 * PAGE_SIZE, which is
> the window below the kernel that is not touched by the early boot code.
> 
> Cc: Guillaume Gardet 
> Cc: Chester Lin  
> Signed-off-by: Ard Biesheuvel 
> ---
>  drivers/firmware/efi/libstub/Makefile |  1 +
>  drivers/firmware/efi/libstub/arm32-stub.c | 16 +++++++++++++---
>  2 files changed, 14 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/firmware/efi/libstub/Makefile b/drivers/firmware/efi/libstub/Makefile
> index 0460c7581220..ee0661ddb25b 100644
> --- a/drivers/firmware/efi/libstub/Makefile
> +++ b/drivers/firmware/efi/libstub/Makefile
> @@ -52,6 +52,7 @@ lib-$(CONFIG_EFI_ARMSTUB)   += arm-stub.o fdt.o string.o random.o \
>  
>  lib-$(CONFIG_ARM)            += arm32-stub.o
>  lib-$(CONFIG_ARM64)          += arm64-stub.o
> +CFLAGS_arm32-stub.o          := -DTEXT_OFFSET=$(TEXT_OFFSET)
>  CFLAGS_arm64-stub.o          := -DTEXT_OFFSET=$(TEXT_OFFSET)
>  
>  #
> diff --git a/drivers/firmware/efi/libstub/arm32-stub.c b/drivers/firmware/efi/libstub/arm32-stub.c
> index e8f7aefb6813..47aafeff3e01 100644
> --- a/drivers/firmware/efi/libstub/arm32-stub.c
> +++ b/drivers/firmware/efi/libstub/arm32-stub.c
> @@ -195,6 +195,7 @@ efi_status_t handle_kernel_image(efi_system_table_t *sys_table,
>                                   unsigned long dram_base,
>                                   efi_loaded_image_t *image)
>  {
> + unsigned long kernel_base;
>   efi_status_t status;
>  
>   /*
> @@ -204,9 +205,18 @@ efi_status_t handle_kernel_image(efi_system_table_t *sys_table,
>    * loaded. These assumptions are made by the decompressor,
>    * before any memory map is available.
>    */
> - dram_base = round_up(dram_base, SZ_128M);
> + kernel_base = round_up(dram_base, SZ_128M);
>  
> - status = reserve_kernel_base(sys_table, dram_base, reserve_addr,
> + /*
> +  * Note that some platforms (notably, the Raspberry Pi 2) put
> +  * spin-tables and other pieces of firmware at the base of RAM,
> +  * abusing the fact that the window of TEXT_OFFSET bytes at the
> +  * base of the kernel image is only partially used at the moment.
> +  * (Up to 5 pages are used for the swapper page table)
> +  */
> + kernel_base += TEXT_OFFSET - 5 * PAGE_SIZE;
> +
> + status = reserve_kernel_base(sys_table, kernel_base, reserve_addr,
>                                reserve_size);
>   if (status != EFI_SUCCESS) {
>           pr_efi_err(sys_table, "Unable to allocate memory for uncompressed kernel.\n");
> @@ -220,7 +230,7 @@ efi_status_t handle_kernel_image(efi_system_table_t *sys_table,
>   *image_size = image->image_size;
>   status = efi_relocate_kernel(sys_table, image_addr, *image_size,
>                                *image_size,
> -                              dram_base + MAX_UNCOMP_KERNEL_SIZE, 0);
> +                              kernel_base + MAX_UNCOMP_KERNEL_SIZE, 0);
>   if (status != EFI_SUCCESS) {
>           pr_efi_err(sys_table, "Failed to relocate kernel.\n");
>           efi_free(sys_table, *reserve_size, *reserve_addr);

Acked-by: Chester Lin 
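
As a rough illustration of the arithmetic in the patch above (not part of the patch itself), the sketch below recomputes the address that handle_kernel_image() hands to reserve_kernel_base(). The function name stub_reserve_base() and the 4 KiB page size are assumptions made for this example, and TEXT_OFFSET is passed as a parameter because its value is platform specific.

```c
#include <stdint.h>

#define SZ_128M   0x08000000u
#define PAGE_SIZE 0x1000u      /* assumed 4 KiB pages */

/* Round x up to a power-of-two alignment. */
uint32_t round_up_pow2(uint32_t x, uint32_t align)
{
	return (x + align - 1) & ~(align - 1);
}

/*
 * Hypothetical helper mirroring the patched logic: the stub's
 * reservation starts 5 pages below the kernel image, leaving the rest
 * of the TEXT_OFFSET window at the base of RAM to the firmware.
 */
uint32_t stub_reserve_base(uint32_t dram_base, uint32_t text_offset)
{
	uint32_t kernel_base = round_up_pow2(dram_base, SZ_128M);

	return kernel_base + text_offset - 5 * PAGE_SIZE;
}
```

With dram_base = 0 and a TEXT_OFFSET of 0x8000 (assumed here as a typical value), this yields 0x3000, so the first three pages of RAM stay outside the stub's reservation. With the 0x208000 TEXT_OFFSET mentioned elsewhere in this thread, it yields 0x203000.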


Re: [PATCH] efi/arm: fix allocation failure when reserving the kernel base

2019-08-21 Thread Chester Lin
On Wed, Aug 21, 2019 at 10:11:01AM +0300, Mike Rapoport wrote:
> On Wed, Aug 21, 2019 at 09:35:16AM +0300, Ard Biesheuvel wrote:
> > On Wed, 21 Aug 2019 at 09:11, Chester Lin  wrote:
> > >
> > > On Tue, Aug 20, 2019 at 03:28:25PM +0300, Ard Biesheuvel wrote:
> > > > On Tue, 20 Aug 2019 at 14:56, Russell King - ARM Linux admin
> > > >  wrote:
> > > > >
> > > > > > On Fri, Aug 02, 2019 at 05:38:54AM +0000, Chester Lin wrote:
> > > > > > diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
> > > > > > index f3ce34113f89..909b11ba48d8 100644
> > > > > > --- a/arch/arm/mm/mmu.c
> > > > > > +++ b/arch/arm/mm/mmu.c
> > > > > > @@ -1184,6 +1184,9 @@ void __init adjust_lowmem_bounds(void)
> > > > > >   phys_addr_t block_start = reg->base;
> > > > > >   phys_addr_t block_end = reg->base + reg->size;
> > > > > >
> > > > > > + if (memblock_is_nomap(reg))
> > > > > > + continue;
> > > > > > +
> > > > > >   if (reg->base < vmalloc_limit) {
> > > > > >   if (block_end > lowmem_limit)
> > > > > >   /*
> > > > >
> > > > > > I think this hunk is sane - if the memory is marked nomap, then it
> > > > > > isn't available for the kernel's use, so as far as calculating
> > > > > > where the lowmem/highmem boundary is, it effectively doesn't exist
> > > > > > and should be skipped.
> > > > >
> > > >
> > > > I agree.
> > > >
> > > > Chester, could you explain what you need beyond this change (and my
> > > > EFI stub change involving TEXT_OFFSET) to make things work on the
> > > > RPi2?
> > > >
> > >
> > > Hi Ard,
> > >
> > > > In fact I am working with Guillaume on booting a zImage kernel and
> > > > openSUSE from grub 2.04 + arm32 efistub, which is how we hit this
> > > > issue on the RPi2, one of our test machines. However, we want a
> > > > better solution that covers all cases rather than just the RPi2,
> > > > since we don't want to affect other platforms either.
> > >
> > 
> > Thanks Chester, but that doesn't answer my question.
> > 
> > Your fix is a single patch that changes various things that are only
> > vaguely related. We have already identified that we need to take
> > TEXT_OFFSET (minus some space used by the swapper page tables) into
> > account into the EFI stub if we want to ensure compatibility with many
> > different platforms, and as it turns out, this applies not only to
> > RPi2 but to other platforms as well, most notably the ones that
> > require a TEXT_OFFSET of 0x208000, since they also have reserved
> > regions at the base of RAM.
> > 
> > My question was what else we need beyond:
> > - the EFI stub TEXT_OFFSET fix [0]
> > - the change to disregard NOMAP memblocks in adjust_lowmem_bounds()
> > - what else???
> 
> I think the only missing part here is to ensure that non-reserved memory in
> bank 0 starts from a PMD-aligned address. I believe this could be done in
> the EFI stub, but I'm not really familiar with it so this is just a
> semi-educated guess :)
>  
> > [0] https://git.kernel.org/pub/scm/linux/kernel/git/efi/efi.git/commit/?h=next&id=0eb7bad595e52666b642a02862ad996a0f9bfcc0
>

Hi Ard and Mike,

Sorry for my misunderstanding; I agree with Mike. We could still hit the
memblock_limit issue if non-reserved memory in bank 0 starts at an
unaligned address.

Regards,
Chester
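
To make the three options discussed in this thread easier to compare, here is a simplified user-space model of the memblock_limit selection loop in adjust_lowmem_bounds(). It is only a sketch: the names struct region, enum variant, model_memblock_limit() and rpi2_map are invented for this example, PMD_SIZE is assumed to be 2 MiB, and the vmalloc_limit check and everything the kernel does after the loop are left out. The region list is the memblock dump quoted in this thread, with flags 0x4 treated as NOMAP.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PMD_SIZE 0x200000ULL                 /* assumed: 2 MiB */
#define IS_ALIGNED(x, a) (((x) & ((a) - 1)) == 0)

struct region {
	uint64_t base;
	uint64_t size;
	bool nomap;                          /* flags: 0x4 in the dump */
};

enum variant {
	ORIGINAL,         /* if (!memblock_limit) */
	SKIP_NOMAP,       /* additionally skip NOMAP regions */
	LIMIT_BELOW_PMD,  /* if (memblock_limit < PMD_SIZE) */
};

/* Memory map taken from the memblock dump quoted in this thread. */
const struct region rpi2_map[] = {
	{ 0x00000000, 0x00001000, true  },
	{ 0x00001000, 0x07ef5000, false },
	{ 0x07ef6000, 0x00014000, true  },
	{ 0x07f0a000, 0x34c35000, false },
};

/* Model of the loop only; returns the limit chosen by the loop. */
uint64_t model_memblock_limit(const struct region *regs, size_t n,
			      enum variant v)
{
	uint64_t lowmem_limit = 0, memblock_limit = 0;

	for (size_t i = 0; i < n; i++) {
		uint64_t block_start = regs[i].base;
		uint64_t block_end = regs[i].base + regs[i].size;
		bool undecided;

		if (v == SKIP_NOMAP && regs[i].nomap)
			continue;

		if (block_end > lowmem_limit)
			lowmem_limit = block_end;

		undecided = (v == LIMIT_BELOW_PMD) ?
			    memblock_limit < PMD_SIZE : memblock_limit == 0;
		if (undecided) {
			if (!IS_ALIGNED(block_start, PMD_SIZE))
				memblock_limit = block_start;
			else if (!IS_ALIGNED(block_end, PMD_SIZE))
				memblock_limit = lowmem_limit;
		}
	}
	return memblock_limit;
}
```

In this model, ORIGINAL and SKIP_NOMAP both return 0x1000 for rpi2_map: once the reserved page is skipped, the next region still starts at the unaligned address 0x1000. LIMIT_BELOW_PMD returns 0x07ef6000, the start of the next unaligned region above PMD_SIZE. That matches the observation that skipping NOMAP regions alone is not enough unless usable memory in bank 0 starts at a PMD-aligned address.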


Re: [PATCH] efi/arm: fix allocation failure when reserving the kernel base

2019-08-20 Thread Chester Lin
On Tue, Aug 20, 2019 at 03:28:25PM +0300, Ard Biesheuvel wrote:
> On Tue, 20 Aug 2019 at 14:56, Russell King - ARM Linux admin
>  wrote:
> >
> > On Fri, Aug 02, 2019 at 05:38:54AM +0000, Chester Lin wrote:
> > > diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
> > > index f3ce34113f89..909b11ba48d8 100644
> > > --- a/arch/arm/mm/mmu.c
> > > +++ b/arch/arm/mm/mmu.c
> > > @@ -1184,6 +1184,9 @@ void __init adjust_lowmem_bounds(void)
> > >   phys_addr_t block_start = reg->base;
> > >   phys_addr_t block_end = reg->base + reg->size;
> > >
> > > + if (memblock_is_nomap(reg))
> > > + continue;
> > > +
> > >   if (reg->base < vmalloc_limit) {
> > >   if (block_end > lowmem_limit)
> > >   /*
> >
> > I think this hunk is sane - if the memory is marked nomap, then it isn't
> > available for the kernel's use, so as far as calculating where the
> > lowmem/highmem boundary is, it effectively doesn't exist and should be
> > skipped.
> >
> 
> I agree.
> 
> Chester, could you explain what you need beyond this change (and my
> EFI stub change involving TEXT_OFFSET) to make things work on the
> RPi2?
>

Hi Ard,

In fact I am working with Guillaume on booting a zImage kernel and openSUSE
from grub 2.04 + arm32 efistub, which is how we hit this issue on the RPi2,
one of our test machines. However, we want a better solution that covers all
cases rather than just the RPi2, since we don't want to affect other
platforms either.

Regards,
Chester


Re: [PATCH] efi/arm: fix allocation failure when reserving the kernel base

2019-08-20 Thread Chester Lin
On Tue, Aug 20, 2019 at 10:49:30AM +0300, Mike Rapoport wrote:
> On Mon, Aug 19, 2019 at 05:56:51PM +0300, Ard Biesheuvel wrote:
> > On Mon, 19 Aug 2019 at 11:01, Chester Lin  wrote:
> > >
> > > Hi Mike and Ard,
> > >
> > > On Thu, Aug 15, 2019 at 04:37:39PM +0300, Mike Rapoport wrote:
> > > > On Thu, Aug 15, 2019 at 02:32:50PM +0300, Ard Biesheuvel wrote:
> > > > > (adding Mike)
> > > > >
> 
> ...
> 
> > > > > > > In this case the kernel failed to reserve CMA, which looks like the
> > > > > > > memblock_limit=0x1000 issue I mentioned in my patch description.
> > > > > > > The first block [0-0xfff] was scanned in adjust_lowmem_bounds(),
> > > > > > > but it is not aligned to PMD_SIZE, so the CMA reservation failed
> > > > > > > because memblock.current_limit was extremely low. That's why my
> > > > > > > patch expands the first reservation from one PAGE_SIZE to one
> > > > > > > PMD_SIZE to avoid this issue. Please let me know if you have any
> > > > > > > suggestions, thank you.
> > > >
> > > >
> > > > > This looks like it is a separate issue. The memblock/cma code should
> > > > > not choke on a reserved page of memory at 0x0.
> > > > >
> > > > > Perhaps Russell or Mike (cc'ed) have an idea how to address this?
> > > >
> > > > > Presuming that the last memblock dump comes from the end of
> > > > > arm_memblock_init() with this memory map
> > > > >
> > > > > memory[0x0] [0x00000000-0x00000fff], 0x00001000 bytes flags: 0x4
> > > > > memory[0x1] [0x00001000-0x07ef5fff], 0x07ef5000 bytes flags: 0x0
> > > > > memory[0x2] [0x07ef6000-0x07f09fff], 0x00014000 bytes flags: 0x4
> > > > > memory[0x3] [0x07f0a000-0x3cb3efff], 0x34c35000 bytes flags: 0x0
> > > >
> > > > > adjust_lowmem_bounds() will set the memblock_limit (and respectively
> > > > > global memblock.current_limit) to 0x1000 and any further
> > > > > memblock_alloc*() will happily fail.
> > > > >
> > > > > I believe that the assumption for memblock_limit calculations was
> > > > > that the first bank has several megs at least.
> > > >
> > > > I wonder if this hack would help:
> > > >
> > > > diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
> > > > index d9a0038..948e5b9 100644
> > > > --- a/arch/arm/mm/mmu.c
> > > > +++ b/arch/arm/mm/mmu.c
> > > > @@ -1206,7 +1206,7 @@ void __init adjust_lowmem_bounds(void)
> > > > >                          * allocated when mapping the start of bank 0, which
> > > > >                          * occurs before any free memory is mapped.
> > > > >                          */
> > > > - if (!memblock_limit) {
> > > > + if (memblock_limit < PMD_SIZE) {
> > > >   if (!IS_ALIGNED(block_start, PMD_SIZE))
> > > >   memblock_limit = block_start;
> > > >   else if (!IS_ALIGNED(block_end, PMD_SIZE))
> > > >
> > >
> > > I applied this patch as well and it works well on rpi-2 model B.
> > >
> > 
> > Thanks, Chester, that is good to know.
> > 
> > However, afaict, this only affects systems where physical memory
> > starts at address 0x0, so I think we need a better fix.
> 
> This hack can be easily extended to handle systems with arbitrary start
> address, but it's still a hack...
> 
> > I know Mike has been looking into the NOMAP stuff lately, and your
> > original patch contains a hunk that makes this code (?) disregard
> > nomap memblocks. That might be a better approach.
> 
> I was actually looking how to replace NOMAP with something else to make
> memblock.memory consistent with actual physical memory banks. But this work
> is stashed for now.
> 
> I'm not sure that skipping NOMAP regions would be good enough.
> > If I understand correctly, with Chester's original patch the reservation
> > of a PMD-aligned chunk of 32M for the kernel made the first conv-mem region
> > PMD aligned and then memblock_limit will be set to the end of this region.
> > 
> > Is there a reason for marking EFI_RESERVED_TYPE as NOMAP rather than simply
> > reserving them with memblock_reserve()?
> 

Hi Mike,

I made this change in the efistub, so I am not sure whether memblock_reserve()
can be linked there. I tried using efi_mem_reserve() but got an "undefined
reference" linker error. Is there a better place to call memblock_reserve()
after the efistub has run?

Thanks,
Chester


Re: [PATCH] efi/arm: fix allocation failure when reserving the kernel base

2019-08-20 Thread Chester Lin
On Mon, Aug 19, 2019 at 05:56:51PM +0300, Ard Biesheuvel wrote:
> On Mon, 19 Aug 2019 at 11:01, Chester Lin  wrote:
> >
> > Hi Mike and Ard,
> >
> > On Thu, Aug 15, 2019 at 04:37:39PM +0300, Mike Rapoport wrote:
> > > On Thu, Aug 15, 2019 at 02:32:50PM +0300, Ard Biesheuvel wrote:
> > > > (adding Mike)
> > > >
> > > > On Thu, 15 Aug 2019 at 14:28, Chester Lin  wrote:
> > > > >
> > > > > Hi Ard,
> > > > >
> > > > > On Thu, Aug 15, 2019 at 10:59:43AM +0300, Ard Biesheuvel wrote:
> > > > > > > On Sun, 4 Aug 2019 at 10:57, Ard Biesheuvel wrote:
> > > > > > >
> > > > > > > Hello Chester,
> > > > > > >
> > > > > > > On Fri, 2 Aug 2019 at 08:40, Chester Lin  wrote:
> > > > > > > >
> > > > > > > > > In some cases the arm32 efistub could fail to allocate memory
> > > > > > > > > for uncompressed kernel. For example, we got the following
> > > > > > > > > error message when verifying EFI stub on Raspberry Pi-2
> > > > > > > > > [kernel-5.2.1 + grub-2.04] :
> > > > > > > > >
> > > > > > > > >   EFI stub: Booting Linux Kernel...
> > > > > > > > >   EFI stub: ERROR: Unable to allocate memory for uncompressed kernel.
> > > > > > > > >   EFI stub: ERROR: Failed to relocate kernel
> > > > > > > > >
> > > > > > > > > After checking the EFI memory map we found that the first page
> > > > > > > > > [0 - 0xfff] had been reserved by Raspberry Pi-2's firmware, and
> > > > > > > > > the efistub tried to set the dram base at 0, which was actually
> > > > > > > > > in a reserved region.
> > > > > > > > >
> > > > > > > >
> > > > > > > > This by itself is a violation of the Linux boot protocol for
> > > > > > > > 32-bit ARM when using the decompressor. The decompressor rounds
> > > > > > > > down its own base address to a multiple of 128 MB, and assumes
> > > > > > > > the whole area is available for the decompressed kernel and
> > > > > > > > related data structures. (The first TEXT_OFFSET bytes are no
> > > > > > > > longer used in practice, which is why putting a reserved region
> > > > > > > > of 4 KB works at the moment, but this is fragile). Note that the
> > > > > > > > decompressor does not look at any DT or EFI provided memory maps
> > > > > > > > *at all*.
> > > > > > > >
> > > > > > > > So unfortunately, this is not something we can fix in the
> > > > > > > > kernel, but we should fix it in the bootloader or in GRUB, so it
> > > > > > > > does not put any reserved regions in the first 128 MB of memory,
> > > > > > >
> > > > > >
> > > > > > > OK, perhaps we can fix this by taking TEXT_OFFSET into account. The
> > > > > > > ARM boot protocol docs are unclear about whether this memory
> > > > > > > should be used or not, but it is no longer used for its original
> > > > > > > purpose (page tables), and the RPi loader already keeps data there.
> > > > > >
> > > > > > Can you check whether the following patch works for you?
> > > > > >
> > > > > > > diff --git a/drivers/firmware/efi/libstub/Makefile b/drivers/firmware/efi/libstub/Makefile
> > > > > > > index 0460c7581220..ee0661ddb25b 100644
> > > > > > > --- a/drivers/firmware/efi/libstub/Makefile
> > > > > > > +++ b/drivers/firmware/efi/libstub/Makefile
> > > > > > > @@ -52,6 +52,7 @@ lib-$(CONFIG_EFI_ARMSTUB) += arm-stub.o fdt.o string.o random.o \
> > > > > >
> > > > > >  lib-$(CONFIG_ARM)  += arm32-stub.o
> > > > >

Re: [PATCH] efi/arm: fix allocation failure when reserving the kernel base

2019-08-19 Thread Chester Lin
Hi Mike and Ard,

On Thu, Aug 15, 2019 at 04:37:39PM +0300, Mike Rapoport wrote:
> On Thu, Aug 15, 2019 at 02:32:50PM +0300, Ard Biesheuvel wrote:
> > (adding Mike)
> > 
> > On Thu, 15 Aug 2019 at 14:28, Chester Lin  wrote:
> > >
> > > Hi Ard,
> > >
> > > On Thu, Aug 15, 2019 at 10:59:43AM +0300, Ard Biesheuvel wrote:
> > > > > On Sun, 4 Aug 2019 at 10:57, Ard Biesheuvel wrote:
> > > > >
> > > > > Hello Chester,
> > > > >
> > > > > On Fri, 2 Aug 2019 at 08:40, Chester Lin  wrote:
> > > > > >
> > > > > > > In some cases the arm32 efistub could fail to allocate memory for
> > > > > > > uncompressed kernel. For example, we got the following error
> > > > > > > message when verifying EFI stub on Raspberry Pi-2
> > > > > > > [kernel-5.2.1 + grub-2.04] :
> > > > > > >
> > > > > > >   EFI stub: Booting Linux Kernel...
> > > > > > >   EFI stub: ERROR: Unable to allocate memory for uncompressed kernel.
> > > > > > >   EFI stub: ERROR: Failed to relocate kernel
> > > > > > >
> > > > > > > After checking the EFI memory map we found that the first page
> > > > > > > [0 - 0xfff] had been reserved by Raspberry Pi-2's firmware, and the
> > > > > > > efistub tried to set the dram base at 0, which was actually in a
> > > > > > > reserved region.
> > > > > >
> > > > >
> > > > > This by itself is a violation of the Linux boot protocol for 32-bit
> > > > > ARM when using the decompressor. The decompressor rounds down its own
> > > > > base address to a multiple of 128 MB, and assumes the whole area is
> > > > > available for the decompressed kernel and related data structures.
> > > > > (The first TEXT_OFFSET bytes are no longer used in practice, which is
> > > > > > why putting a reserved region of 4 KB works at the moment, but
> > > > > this is fragile). Note that the decompressor does not look at any DT
> > > > > or EFI provided memory maps *at all*.
> > > > >
> > > > > So unfortunately, this is not something we can fix in the kernel, but
> > > > > we should fix it in the bootloader or in GRUB, so it does not put any
> > > > > reserved regions in the first 128 MB of memory,
> > > > >
> > > >
> > > > OK, perhaps we can fix this by taking TEXT_OFFSET into account. The
> > > > ARM boot protocol docs are unclear about whether this memory should be
> > > > used or not, but it is no longer used for its original purpose (page
> > > > tables), and the RPi loader already keeps data there.
> > > >
> > > > Can you check whether the following patch works for you?
> > > >
> > > > > diff --git a/drivers/firmware/efi/libstub/Makefile b/drivers/firmware/efi/libstub/Makefile
> > > > > index 0460c7581220..ee0661ddb25b 100644
> > > > > --- a/drivers/firmware/efi/libstub/Makefile
> > > > > +++ b/drivers/firmware/efi/libstub/Makefile
> > > > > @@ -52,6 +52,7 @@ lib-$(CONFIG_EFI_ARMSTUB) += arm-stub.o fdt.o string.o random.o \
> > > >
> > > >  lib-$(CONFIG_ARM)  += arm32-stub.o
> > > >  lib-$(CONFIG_ARM64)+= arm64-stub.o
> > > > +CFLAGS_arm32-stub.o:= -DTEXT_OFFSET=$(TEXT_OFFSET)
> > > >  CFLAGS_arm64-stub.o:= -DTEXT_OFFSET=$(TEXT_OFFSET)
> > > >
> > > >  #
> > > > > diff --git a/drivers/firmware/efi/libstub/arm32-stub.c b/drivers/firmware/efi/libstub/arm32-stub.c
> > > > > index e8f7aefb6813..66ff0c8ec269 100644
> > > > > --- a/drivers/firmware/efi/libstub/arm32-stub.c
> > > > > +++ b/drivers/firmware/efi/libstub/arm32-stub.c
> > > > > @@ -204,7 +204,7 @@ efi_status_t handle_kernel_image(efi_system_table_t *sys_table,
> > > >  * loaded. These assumptions are made by the decompressor,
> > > >  * before any memory map is available.
> > > >  */
> > > > -   dram_base = round_up(dram_base, SZ_128M);
> > > > +   dram_base = round_up(dram_base, SZ_128M) + TEXT_OFFSET;
> > > >
> > > > status = reserve_kernel_base(sys_table, dram_base, reserve_addr,
> > &

Re: [PATCH] efi/arm: fix allocation failure when reserving the kernel base

2019-08-15 Thread Chester Lin
Hi Ard,

On Thu, Aug 15, 2019 at 10:59:43AM +0300, Ard Biesheuvel wrote:
> On Sun, 4 Aug 2019 at 10:57, Ard Biesheuvel  wrote:
> >
> > Hello Chester,
> >
> > On Fri, 2 Aug 2019 at 08:40, Chester Lin  wrote:
> > >
> > > In some cases the arm32 efistub could fail to allocate memory for
> > > uncompressed kernel. For example, we got the following error message when
> > > verifying EFI stub on Raspberry Pi-2 [kernel-5.2.1 + grub-2.04] :
> > >
> > >   EFI stub: Booting Linux Kernel...
> > >   EFI stub: ERROR: Unable to allocate memory for uncompressed kernel.
> > >   EFI stub: ERROR: Failed to relocate kernel
> > >
> > > After checking the EFI memory map we found that the first page [0 - 0xfff]
> > > had been reserved by Raspberry Pi-2's firmware, and the efistub tried to
> > > set the dram base at 0, which was actually in a reserved region.
> > >
> >
> > This by itself is a violation of the Linux boot protocol for 32-bit
> > ARM when using the decompressor. The decompressor rounds down its own
> > base address to a multiple of 128 MB, and assumes the whole area is
> > available for the decompressed kernel and related data structures.
> > (The first TEXT_OFFSET bytes are no longer used in practice, which is
> > > why putting a reserved region of 4 KB works at the moment, but
> > this is fragile). Note that the decompressor does not look at any DT
> > or EFI provided memory maps *at all*.
> >
> > So unfortunately, this is not something we can fix in the kernel, but
> > we should fix it in the bootloader or in GRUB, so it does not put any
> > reserved regions in the first 128 MB of memory,
> >
> 
> OK, perhaps we can fix this by taking TEXT_OFFSET into account. The
> ARM boot protocol docs are unclear about whether this memory should be
> used or not, but it is no longer used for its original purpose (page
> tables), and the RPi loader already keeps data there.
> 
> Can you check whether the following patch works for you?
> 
> diff --git a/drivers/firmware/efi/libstub/Makefile b/drivers/firmware/efi/libstub/Makefile
> index 0460c7581220..ee0661ddb25b 100644
> --- a/drivers/firmware/efi/libstub/Makefile
> +++ b/drivers/firmware/efi/libstub/Makefile
> @@ -52,6 +52,7 @@ lib-$(CONFIG_EFI_ARMSTUB) += arm-stub.o fdt.o string.o random.o \
> 
>  lib-$(CONFIG_ARM)  += arm32-stub.o
>  lib-$(CONFIG_ARM64)+= arm64-stub.o
> +CFLAGS_arm32-stub.o:= -DTEXT_OFFSET=$(TEXT_OFFSET)
>  CFLAGS_arm64-stub.o:= -DTEXT_OFFSET=$(TEXT_OFFSET)
> 
>  #
> diff --git a/drivers/firmware/efi/libstub/arm32-stub.c b/drivers/firmware/efi/libstub/arm32-stub.c
> index e8f7aefb6813..66ff0c8ec269 100644
> --- a/drivers/firmware/efi/libstub/arm32-stub.c
> +++ b/drivers/firmware/efi/libstub/arm32-stub.c
> @@ -204,7 +204,7 @@ efi_status_t handle_kernel_image(efi_system_table_t *sys_table,
>  * loaded. These assumptions are made by the decompressor,
>  * before any memory map is available.
>  */
> -   dram_base = round_up(dram_base, SZ_128M);
> +   dram_base = round_up(dram_base, SZ_128M) + TEXT_OFFSET;
> 
> status = reserve_kernel_base(sys_table, dram_base, reserve_addr,
>  reserve_size);
> 

I tried your patch on rpi2 and got the following panic. Note that I have
replaced some log messages with ".." since the full log would be too long to
post.

In this case the kernel failed to reserve CMA, which looks like the
memblock_limit=0x1000 issue I mentioned in my patch description. The first
block [0-0xfff] was scanned in adjust_lowmem_bounds(), but it is not aligned
to PMD_SIZE, so the CMA reservation failed because memblock.current_limit was
extremely low. That's why my patch expands the first reservation from one
PAGE_SIZE to one PMD_SIZE to avoid this issue. Please let me know if you have
any suggestions, thank you.

boot-log:


Loading Linux test ...
EFI stub: Booting Linux Kernel...
EFI stub: Using DTB from configuration table
EFI stub: Exiting boot services and installing virtual address map...
Uncompressing Linux... done, booting the kernel.
[    0.000000] Booting Linux on physical CPU 0xf00
[    0.000000] Linux version 5.2.1-lpae (chester@linux-8mug) (..)
[    0.000000] CPU: ARMv7 Processor [410fc075] revision 5 (ARMv7), cr=30c5387d
[    0.000000] CPU: div instructions available: patching division code
[    0.000000] CPU: PIPT / VIPT nonaliasing data cache, VIPT aliasing instruction cache
[    0.000000] OF: fdt: Machine model: Raspberry Pi 2 Model B Rev 1.1
[    0.000000] printk: bootconsole [earlycon0] enabled
[0.

[PATCH] efi/arm: fix allocation failure when reserving the kernel base

2019-08-01 Thread Chester Lin
In some cases the arm32 efistub could fail to allocate memory for
uncompressed kernel. For example, we got the following error message when
verifying EFI stub on Raspberry Pi-2 [kernel-5.2.1 + grub-2.04] :

  EFI stub: Booting Linux Kernel...
  EFI stub: ERROR: Unable to allocate memory for uncompressed kernel.
  EFI stub: ERROR: Failed to relocate kernel

After checking the EFI memory map we found that the first page [0 - 0xfff]
had been reserved by Raspberry Pi-2's firmware, and the efistub tried to
set the dram base at 0, which was actually in a reserved region.

  grub> lsefimmap
  Type      Physical start  - end  #Pages          Size  Attributes
  reserved  00000000-00000fff      00000001        4KiB  WB
  conv-mem  00001000-07ef5fff      00007ef5   130004KiB  WB
  RT-data   07ef6000-07f09fff      00000014       80KiB  RT WB
  conv-mem  07f0a000-2d871fff      00025968   615840KiB  WB
  .

To avoid a reserved address, we have to ignore the memory regions which are
marked as EFI_RESERVED_TYPE, so that only conventional memory regions can be
chosen. If the region before the kernel base is unaligned, it is marked as
EFI_RESERVED_TYPE so that the kernel ignores it and memblock_limit does not
get stuck at a very low address such as 0x1000.
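
For illustration only, the alignment step described above can be sketched as follows. The struct placement type and the place_kernel() helper are hypothetical names for this example, and PMD_SIZE is assumed to be 2 MiB; the actual change is in the reserve_kernel_base() hunk of the diff.

```c
#include <stdint.h>

#define PMD_SIZE 0x200000ULL   /* assumed: 2 MiB */

struct placement {
	uint64_t kernel_base;  /* PMD-aligned start of the kernel window */
	uint64_t spare;        /* unaligned gap below it, to be reserved */
};

/*
 * Hypothetical helper: given the start of a conventional memory region
 * and the DRAM base, pick a PMD-aligned kernel base and report the gap
 * that would be left in front of it.
 */
struct placement place_kernel(uint64_t region_start, uint64_t dram_base)
{
	uint64_t start = region_start > dram_base ? region_start : dram_base;
	struct placement p;

	p.kernel_base = (start + PMD_SIZE - 1) & ~(PMD_SIZE - 1);
	p.spare = p.kernel_base - start;
	return p;
}
```

For the conv-mem region starting at 0x1000 in the map above, this gives kernel_base = 0x200000 and spare = 0x1ff000, which is the gap that would be marked reserved.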

Signed-off-by: Chester Lin 
---
 arch/arm/mm/mmu.c |  3 ++
 drivers/firmware/efi/libstub/arm32-stub.c | 43 ++-
 2 files changed, 37 insertions(+), 9 deletions(-)

diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
index f3ce34113f89..909b11ba48d8 100644
--- a/arch/arm/mm/mmu.c
+++ b/arch/arm/mm/mmu.c
@@ -1184,6 +1184,9 @@ void __init adjust_lowmem_bounds(void)
phys_addr_t block_start = reg->base;
phys_addr_t block_end = reg->base + reg->size;
 
+   if (memblock_is_nomap(reg))
+   continue;
+
if (reg->base < vmalloc_limit) {
if (block_end > lowmem_limit)
/*
diff --git a/drivers/firmware/efi/libstub/arm32-stub.c b/drivers/firmware/efi/libstub/arm32-stub.c
index e8f7aefb6813..10d33d36df00 100644
--- a/drivers/firmware/efi/libstub/arm32-stub.c
+++ b/drivers/firmware/efi/libstub/arm32-stub.c
@@ -128,7 +128,7 @@ static efi_status_t reserve_kernel_base(efi_system_table_t *sys_table_arg,
 
for (l = 0; l < map_size; l += desc_size) {
efi_memory_desc_t *desc;
-   u64 start, end;
+   u64 start, end, spare, kernel_base;
 
desc = (void *)memory_map + l;
start = desc->phys_addr;
@@ -144,27 +144,52 @@ static efi_status_t reserve_kernel_base(efi_system_table_t *sys_table_arg,
case EFI_BOOT_SERVICES_DATA:
/* Ignore types that are released to the OS anyway */
continue;
-
+   case EFI_RESERVED_TYPE:
+   /* Ignore reserved regions */
+   continue;
case EFI_CONVENTIONAL_MEMORY:
/*
 * Reserve the intersection between this entry and the
 * region.
 */
start = max(start, (u64)dram_base);
-   end = min(end, (u64)dram_base + MAX_UNCOMP_KERNEL_SIZE);
+   kernel_base = round_up(start, PMD_SIZE);
+   spare = kernel_base - start;
+   end = min(end, kernel_base + MAX_UNCOMP_KERNEL_SIZE);
+
+   status = efi_call_early(allocate_pages,
+   EFI_ALLOCATE_ADDRESS,
+   EFI_LOADER_DATA,
+   MAX_UNCOMP_KERNEL_SIZE / EFI_PAGE_SIZE,
+   &kernel_base);
+   if (status != EFI_SUCCESS) {
+   pr_efi_err(sys_table_arg,
+   "reserve_kernel_base: alloc failed.\n");
+   goto out;
+   }
+   *reserve_addr = kernel_base;
 
+   if (!spare)
+   break;
+   /*
+* If there's a gap between start and kernel_base,
+* it needs to be reserved so that the memblock_limit
+* will not fall on a very low address when running
+* adjust_lowmem_bounds(), which could eventually
+* cause a CMA reservation issue.
+*/
status = efi_call_early(allocate_pages,
EFI_ALLOCATE_ADDRESS,
-   EFI_LOADER_DATA,
-