On 21 July 2015 at 10:39, Gao, Liming <liming....@intel.com> wrote:
> Ard:
>   I review your changes in GenSec, GenFfs and GenFv tool. This change assumes 
> there is only one section in FFS with the specific alignment. If there are 
> more than one sections have the different alignments in single FFS, this 
> patch may not work. For example, the first section aligns at 16 byte, the 
> second section aligns at 4K. This patch can make sure the second section data 
> still aligns at 4K, but the first section data may not align at 16 byte. And, 
> another case is that the first section aligns at 1K, the second section 
> aligns at 4K, FFS align is 4K. This patch will adjust the first section align 
> at 4K, the second section may not align at 4k.

Hello Liming,

Thanks for taking the time to review my patches.

I agree that my proposed alignment changes are flawed atm, but I don't
think your analysis is entirely correct either.

My changes will ensure that, if a special padding section is
encountered, it will be reduced in size to align the first byte of the
payload of the following FFS section to the alignment required for the
entire file. So whether a 4 KB follows a 16 byte aligned section or
the other way around, the global alignment of the FFS will be that
maximum of those values, and compensating for the misalignment of the
start of the payload of the FFS will ensure that all sections end up
at their minimum required alignment.

However, my patch does not work correctly if the special padding
section comes after sections that may need alignment as well. For
instance, imagine an FFS consisting of a section requiring 16 byte
alignment followed by a special padding section and a section
requiring 4 KB alignment. In this case, my changes will result in the
padding section to be reduced, and all sections starting with the 4 KB
section will be at the correct alignment, but the first section may
end up incorrectly aligned.

I will go back to the patches, and figure out if there is a way around
this. I think it may be feasible to only emit the special padding
section if none of the sections that preceded it required any
alignment at all. That way, we can be sure that changing the size of
the padding section moves all sections that have alignment
requirements (that all come after the padding section) will be shifted
into place.

>
>   For GenFw tool change, I agree it.
>

Thanks.

-- 
Ard.

> Sent: Saturday, July 18, 2015 3:38 AM
> To: Andrew Fish
> Cc: Eugene Cohen; edk2-devel@lists.01.org; olivier.mar...@arm.com; 
> leif.lindh...@linaro.org; Gao, Liming; Liu, Yingke D; Justen, Jordan L; 
> ler...@redhat.com
> Subject: Re: [edk2] [RFC PATCH 0/8] small C model and LLVM/clang support for 
> AARCH64
>
> On 17 July 2015 at 21:28, Andrew Fish <af...@apple.com> wrote:
>>
>>> On Jul 17, 2015, at 12:07 PM, Cohen, Eugene <eug...@hp.com> wrote:
>>>
>>> Ard,
>>>
>>>> The bottom line is that we are still losing around 40 KB on padding
>>>> in total,
>>>
>>> If I understand correctly you're saying because of limitations in the 
>>> AArch64 LLVM/clang toolchain we must suffer some sort of code size increase 
>>> (of which you made great progress in minimizing) because of its inherent 
>>> alignment requirements.  For those of us that are extremely sensitive to 
>>> XIP code size (flash / SRAM constraints) this is not ideal.  I'm normally 
>>> in favor of supporting more toolchains but this size increase is difficult 
>>> to swallow.
>>>
>>> Since ARM has a compiler team that works on clang/LLVM (used ARM Compiler 
>>> 6) would it be totally unreasonable to a ask about supporting the 'large C' 
>>> model?
>>>
>>> What would be required to continue to support the 'large C' (ironically 
>>> more space efficient for XIP) as an option for those of us building with 
>>> GCC that want to optimize fully for size?  Is it as simple as a condition 
>>> that changes the -mcmodel parameter in tools_def?
>>>
>>
>> If the sections require 4K alignment, does not that imply the image needs to 
>> be 4K aligned? That means there is going to be dead space between the PEIMs 
>> in the FV, no tool is ever going to fix that.
>
> Well, I did find a way to combine leading padding in an FFS with leading 
> padding in the FV, by making the leading padding in the FFS GUID 
> identifiable, and adjusting it on the fly when generating the FV.
> This recovers 4 KB per XIP module.
>
>> I’d also point out that the PEI Core is aggressive about shadowing PEIMs, so 
>> you probably need to deal with relocations too. I think at a minimum the PEI 
>> Core and DXE IPL (if I remember correctly) both run XIP and get shadowed. 
>> Conceptually you can always compress the always shadowed PEIMs, so the 
>> alignment will not be a big issue in that case.
>>
>> It seems a 4K aligned image is always going to larger than a 32 byte aligned 
>> image. It seems like it would be easier to use different linker flags for 
>> PEI (32 byte alignment), and not PEI (4K alignment)? Is there some feature 
>> requiring 4K alignment in PEI?
>>
>
> The problem leading up to this was the fact that LLVM/clang for
> AArch64 does not support the generation of code that can be executed at any 
> offset. The ADRP/ADD and ADRP/LDR instruction pairs used to emit PC-relative 
> symbols references have a fundamental reliance on preserving their build time 
> alignment at run time.
>
> --
> Ard.
>
>
>>> -----Original Message-----
>>> From: edk2-devel [mailto:edk2-devel-boun...@lists.01.org] On Behalf Of Ard 
>>> Biesheuvel
>>> Sent: Friday, July 17, 2015 6:22 AM
>>> To: edk2-devel@lists.01.org; olivier.mar...@arm.com; 
>>> leif.lindh...@linaro.org; liming....@intel.com; yingke.d....@intel.com
>>> Cc: jordan.l.jus...@intel.com; ler...@redhat.com; Ard Biesheuvel 
>>> <ard.biesheu...@linaro.org>
>>> Subject: [edk2] [RFC PATCH 0/8] small C model and LLVM/clang support for 
>>> AARCH64
>>>
>>> Warning: long email ahead. TL;DR: with some reshuffling of the binaries, we 
>>> can
>>> mitigate the fallout of increasing the section alignment to 4 KB, which is a
>>> prerequisite for proper support of LLVM and the GCC small C model for 
>>> AArch64.
>>>
>>> This is an RFC series, although there are few controversial changes as far 
>>> as I
>>> can tell.
>>>
>>> -*-
>>>
>>> On AARCH64, we have been using the so-called 'large' C model, which makes no
>>> assumptions about the relative distance between symbols and any references 
>>> to
>>> it. This implies that all symbol references are absolute 64-bit pointers, 
>>> and
>>> need to be fixed up at relocation time.
>>>
>>> The more common 'small' C model, which is also the only model supported by
>>> LLVM/clang for AArch64, assumes a maximum relative distance of 4 GB, and 
>>> uses
>>> ADRP/ADD and ADRP/LDR pairs to emit relative symbol references, that do not
>>> require fixups at relocation time. However, there is a downside to the small
>>> C model: since ADRP uses page-based rounding, the page-relative alignment of
>>> these instruction pairs needs to be preserved, which implies using 4 KB 
>>> section
>>> alignment in the intermediate ELF and final PE/COFF executable images.
>>>
>>> Such increased section alignment is currently supported by the BaseTools, 
>>> but
>>> due to various reasons, the wasted space is substantial. For non-XIP 
>>> modules,
>>> this is not a problem since these are usually compressed anyway. For XIP
>>> modules, however, this is a big deal.
>>>
>>> This series attempts to address those issues, by removing sections, and 
>>> reducing
>>> the redundant padding caused by the way FFS file are embedded into FVs.
>>>
>>> This shows how a sequence of a SEC module, the PEI core module and a couple 
>>> of
>>> PEIMs are laid out in an FV when building ArmVirtQemu.dsc. The SEC module is
>>> aligned to 0x800 (2 KB), since it contains an exception vector table which
>>> requires this alignment at a minimum. To ensure that the alignment is 
>>> honored
>>> at the FFS and FV level, TE=Auto is used for the alignment [Rule] for these
>>> modules (patch #1). Each VZ header is the start of a TE image.
>>>
>>> 00000160  56 5a 64 aa 03 0b 88 01  08 08 00 00 00 08 00 00  
>>> |VZd.............|
>>> 0000c0a0  56 5a 64 aa 03 0b 88 01  a8 02 00 00 40 02 00 00  
>>> |VZd.........@...|
>>> 00027e60  56 5a 64 aa 03 0b 88 01  c0 02 00 00 40 02 00 00  
>>> |VZd.........@...|
>>> 0002f720  56 5a 64 aa 03 0b 88 01  c0 02 00 00 40 02 00 00  
>>> |VZd.........@...|
>>> 00039be0  56 5a 64 aa 03 0b 88 01  c0 02 00 00 40 02 00 00  
>>> |VZd.........@...|
>>> 00040e20  56 5a 64 aa 03 0b 88 01  c0 02 00 00 40 02 00 00  
>>> |VZd.........@...|
>>> 0004c360  56 5a 64 aa 03 0b 88 01  c0 02 00 00 40 02 00 00  
>>> |VZd.........@...|
>>> 00054960  56 5a 64 aa 03 0b 88 01  c0 02 00 00 40 02 00 00  
>>> |VZd.........@...|
>>> ...
>>>
>>> Once we increase the section alignment to 4 KB, we can clearly observe that 
>>> due
>>> to the rounding in various places, we waste around 20 KB per module.
>>>
>>> 00001160  56 5a 64 aa 04 0b 88 01  08 10 00 00 00 10 00 00  
>>> |VZd.............|
>>> 00011160  56 5a 64 aa 04 0b 88 01  68 10 00 00 00 10 00 00  
>>> |VZd.....h.......|
>>> 00032160  56 5a 64 aa 04 0b 88 01  80 10 00 00 00 10 00 00  
>>> |VZd.............|
>>> 0003f160  56 5a 64 aa 04 0b 88 01  80 10 00 00 00 10 00 00  
>>> |VZd.............|
>>> 0004e160  56 5a 64 aa 04 0b 88 01  80 10 00 00 00 10 00 00  
>>> |VZd.............|
>>> 0005a160  56 5a 64 aa 04 0b 88 01  80 10 00 00 00 10 00 00  
>>> |VZd.............|
>>> 0006a160  56 5a 64 aa 04 0b 88 01  80 10 00 00 00 10 00 00  
>>> |VZd.............|
>>> 00078160  56 5a 64 aa 04 0b 88 01  80 10 00 00 00 10 00 00  
>>> |VZd.............|
>>> ...
>>>
>>> However, for a fair comparison, we should choose a baseline that already 
>>> has the
>>> small model enabled. So after applying only patches #7, and #8, which 
>>> enable the
>>> small C model and set the minimum section alignment in the linker script, we
>>> observe the following memory footprint of the XIP region.
>>>
>>> 00001160  56 5a 64 aa 04 0b 88 01  08 10 00 00 00 10 00 00  
>>> |VZd.............|
>>> 00011160  56 5a 64 aa 04 0b 88 01  68 10 00 00 00 10 00 00  
>>> |VZd.....h.......|
>>> 00031160  56 5a 64 aa 04 0b 88 01  80 10 00 00 00 10 00 00  
>>> |VZd.............|
>>> 0003e160  56 5a 64 aa 04 0b 88 01  80 10 00 00 00 10 00 00  
>>> |VZd.............|
>>> 0004d160  56 5a 64 aa 04 0b 88 01  80 10 00 00 00 10 00 00  
>>> |VZd.............|
>>> 00059160  56 5a 64 aa 04 0b 88 01  80 10 00 00 00 10 00 00  
>>> |VZd.............|
>>> 00069160  56 5a 64 aa 04 0b 88 01  80 10 00 00 00 10 00 00  
>>> |VZd.............|
>>> 00076160  56 5a 64 aa 04 0b 88 01  80 10 00 00 00 10 00 00  
>>> |VZd.............|
>>> ...
>>>
>>> Patch #2 moves the contents of the .debug section to .data (or .text if 
>>> .data
>>> is empty). This allows the .debug section to be dropped, and will also allow
>>> .reloc sections to be dropped for RELOCS_STRIPPED XIP images (since this is 
>>> only
>>> possible if the .reloc section is the last one in the image). Note that a
>>> previous version of this patch has been sent to the list already.
>>>
>>> 00001160  56 5a 64 aa 03 0b 88 01  08 10 00 00 00 10 00 00  
>>> |VZd.............|
>>> 0000f160  56 5a 64 aa 03 0b 88 01  68 10 00 00 00 10 00 00  
>>> |VZd.....h.......|
>>> 0002e160  56 5a 64 aa 03 0b 88 01  80 10 00 00 00 10 00 00  
>>> |VZd.............|
>>> 0003a160  56 5a 64 aa 03 0b 88 01  80 10 00 00 00 10 00 00  
>>> |VZd.............|
>>> 00048160  56 5a 64 aa 03 0b 88 01  80 10 00 00 00 10 00 00  
>>> |VZd.............|
>>> 00053160  56 5a 64 aa 03 0b 88 01  80 10 00 00 00 10 00 00  
>>> |VZd.............|
>>> 00062160  56 5a 64 aa 03 0b 88 01  80 10 00 00 00 10 00 00  
>>> |VZd.............|
>>> 0006e160  56 5a 64 aa 03 0b 88 01  80 10 00 00 00 10 00 00  
>>> |VZd.............|
>>> ...
>>>
>>> Patch #3 updates the PE/COFF generation code so that the PE/COFF header (the
>>> second one after the DOS header) is moved as far as possible towards the 
>>> start
>>> of the payload (the first section). This will put all the padding before it,
>>> which allows us to drop it from the TE version of the image entirely. This
>>> change by itself does not reduce the overall footprint, but it will help us
>>> get rid of some of the padding later on.
>>>
>>> 00001f30  00 00 00 00 cc c0 00 12  56 5a 64 aa 03 0b 60 0f  
>>> |........VZd...`.|
>>> 0000ff30  00 00 00 00 cc d0 01 12  56 5a 64 aa 03 0b 60 0f  
>>> |........VZd...`.|
>>> 0002ef30  00 00 00 00 cc a0 00 12  56 5a 64 aa 03 0b 60 0f  
>>> |........VZd...`.|
>>> 0003af30  00 00 00 00 cc c0 00 12  56 5a 64 aa 03 0b 60 0f  
>>> |........VZd...`.|
>>> 00048f30  00 00 00 00 cc 90 00 12  56 5a 64 aa 03 0b 60 0f  
>>> |........VZd...`.|
>>> 00053f30  00 00 00 00 cc d0 00 12  56 5a 64 aa 03 0b 60 0f  
>>> |........VZd...`.|
>>> 00062f30  00 00 00 00 cc a0 00 12  56 5a 64 aa 03 0b 60 0f  
>>> |........VZd...`.|
>>> 0006ef30  00 00 00 00 cc 10 01 12  56 5a 64 aa 03 0b 60 0f  
>>> |........VZd...`.|
>>> ...
>>>
>>> (note that the relative offsets of the modules are the same, e.g., the first
>>> section of the second module still starts at 0x10000)
>>>
>>> Patches #4 - #5 address the redundant padding that is present in FFS files
>>> included as XIP modules in firmware volumes. By using a GUID identifiable
>>> section for the alignment that is added to the FFS for alignment purposes 
>>> only,
>>> we can later adjust this section by reducing its size, and line up the 
>>> actual
>>> payload to the global alignment with respect to the base of the FV.
>>>
>>> 00000f30  00 00 00 00 cc c0 00 12  56 5a 64 aa 03 0b 60 0f  
>>> |........VZd...`.|
>>> 0000df30  00 00 00 00 cc d0 01 12  56 5a 64 aa 03 0b 60 0f  
>>> |........VZd...`.|
>>> 0002bf30  00 00 00 00 cc a0 00 12  56 5a 64 aa 03 0b 60 0f  
>>> |........VZd...`.|
>>> 00036f30  00 00 00 00 cc c0 00 12  56 5a 64 aa 03 0b 60 0f  
>>> |........VZd...`.|
>>> 00043f30  00 00 00 00 cc 90 00 12  56 5a 64 aa 03 0b 60 0f  
>>> |........VZd...`.|
>>> 0004df30  00 00 00 00 cc d0 00 12  56 5a 64 aa 03 0b 60 0f  
>>> |........VZd...`.|
>>> 0005bf30  00 00 00 00 cc a0 00 12  56 5a 64 aa 03 0b 60 0f  
>>> |........VZd...`.|
>>> 00066f30  00 00 00 00 cc 10 01 12  56 5a 64 aa 03 0b 60 0f  
>>> |........VZd...`.|
>>> ...
>>>
>>> Finally, patch #6 introduces a dedicated linker script for XIP modules, and
>>> wires it up for ArmVirtQemu. The trick here is that, since these are 
>>> basically
>>> ROM modules anyway, we can put all program code in the .text section, and 
>>> avoid
>>> any rounding associated with having a separate .data section. (Ideally, 
>>> having
>>> separate linker options for XIP modules should be supported generically by 
>>> the
>>> base tools, but for this proof of concept, I added it to the platform .DSC)
>>>
>>> 00000f30  00 00 00 00 cc b0 00 12  56 5a 64 aa 02 0b 60 0f  
>>> |........VZd...`.|
>>> 0000cf30  00 00 00 00 cc c0 01 12  56 5a 64 aa 02 0b 60 0f  
>>> |........VZd...`.|
>>> 00029f30  00 00 00 00 cc 90 00 12  56 5a 64 aa 02 0b 60 0f  
>>> |........VZd...`.|
>>> 00033f30  00 00 00 00 cc b0 00 12  56 5a 64 aa 02 0b 60 0f  
>>> |........VZd...`.|
>>> 0003ff30  00 00 00 00 cc 80 00 12  56 5a 64 aa 02 0b 60 0f  
>>> |........VZd...`.|
>>> 00048f30  00 00 00 00 cc c0 00 12  56 5a 64 aa 02 0b 60 0f  
>>> |........VZd...`.|
>>> 00055f30  00 00 00 00 cc a0 00 12  56 5a 64 aa 02 0b 60 0f  
>>> |........VZd...`.|
>>> 00060f30  00 00 00 00 cc 00 01 12  56 5a 64 aa 02 0b 60 0f  
>>> |........VZd...`.|
>>> ...
>>>
>>> The bottom line is that we are still losing around 40 KB on padding in 
>>> total,
>>> but this should be tolerable compared to the 20 KB per module == 160 KB 
>>> that we
>>> started out with.
>>>
>>> Patch #7 and #8 enable the small C model for AARCH64.
>>>
>>> Ard Biesheuvel (8):
>>>  ArmVirtPkg: use 'auto' alignment for XIP modules
>>>  BaseTools/GenFw: move .debug contents to .data to save space
>>>  BaseTools/GenFw: move PE/COFF header closer to payload
>>>  BaseTools: use GUID identifiable section for FFS alignment padding
>>>  BaseTools/GenFv: optimize away redundant padding
>>>  ArmVirtPkg: merge .text and .data in 4 KB section aligned XIP modules
>>>  BaseTools/GenFw: allow AArch64 small model relocations
>>>  BaseTools: enable small C model for AARCH64/GCC
>>>
>>> ArmVirtPkg/ArmVirt.dsc.inc                                                  
>>>                |   3 +
>>> ArmVirtPkg/ArmVirtQemu.fdf                                                  
>>>                |   6 +-
>>> ArmVirtPkg/ArmVirtXen.fdf                                                   
>>>                |   6 +-
>>> BaseTools/Conf/tools_def.template                                           
>>>                |   2 +-
>>> BaseTools/Scripts/gcc-aarch64-ld-script                                     
>>>                |   6 +-
>>> BaseTools/Scripts/gcc-aarch64-xip-ld-script                                 
>>>                |  13 ++
>>> BaseTools/Source/C/GenFfs/GenFfs.c                                          
>>>                |  44 +++---
>>> BaseTools/Source/C/GenFv/GenFvInternalLib.c                                 
>>>                | 160 +++++++++++++++++++-
>>> BaseTools/Source/C/GenFw/Elf32Convert.c                                     
>>>                |  66 ++++----
>>> BaseTools/Source/C/GenFw/Elf64Convert.c                                     
>>>                | 145 +++++++++---------
>>> BaseTools/Source/C/GenSec/GenSec.c                                          
>>>                |  47 +++---
>>> BaseTools/Source/C/Include/{IndustryStandard/pci23.h => 
>>> Guid/FfsSectionAlignmentPadding.h} |  18 +--
>>> 12 files changed, 352 insertions(+), 164 deletions(-)
>>> create mode 100644 BaseTools/Scripts/gcc-aarch64-xip-ld-script
>>> copy BaseTools/Source/C/Include/{IndustryStandard/pci23.h => 
>>> Guid/FfsSectionAlignmentPadding.h} (51%)
>>>
>>> --
>>> 1.9.1
>>>
>>> _______________________________________________
>>> edk2-devel mailing list
>>> edk2-devel@lists.01.org
>>> https://lists.01.org/mailman/listinfo/edk2-devel
>>> _______________________________________________
>>> edk2-devel mailing list
>>> edk2-devel@lists.01.org
>>> https://lists.01.org/mailman/listinfo/edk2-devel
>>
_______________________________________________
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel

Reply via email to