Re: CREL relocation format for ELF (was: RELLEB)

2024-03-28 Thread Fangrui Song via Gcc
On Thu, Mar 28, 2024 at 6:04 AM Alan Modra wrote:
>
> On Fri, Mar 22, 2024 at 06:51:41PM -0700, Fangrui Song wrote:
> > On Thu, Mar 14, 2024 at 5:16 PM Fangrui Song wrote:
> > > I propose RELLEB, a new format offering significant file size
> > > reductions: 17.2% (x86-64), 16.5% (aarch64), and even 32.4% (riscv64)!
> > >
> > > Your thoughts on RELLEB are welcome!
>
> Does anyone really care about relocatable object file size?  If they
> do, wouldn't they be better off using a compressed file system?

Yes, many people care about relocatable file sizes.

* Relocation sizes affect DWARF evolution and we were/are using an
imperfect metric due to overly bloated REL/RELA. .debug_str_offsets
does not get much traction in GCC, probably partly because it needs
relocations. DWARF v5 introduced changes to keep relocations small.
Many are good on their own, but we need to be cautious of relocation
concerns causing us to pick the wrong trade-off in the future.
* On many Linux targets, Clang emits .llvm_addrsig by default to allow
ld.lld --icf=safe. .llvm_addrsig stores symbol indexes in ULEB128
instead of using relocations to prevent a significant size increase.
* Static relocations make .a files larger.
* Some users care about the build artifact size due to limited disk space.
  + I believe part of the reason -ffunction-sections -fdata-sections
do not get more adoption is the relocatable file size concern.
  + I prefer to place build directories in Linux tmpfs. 12G vs 10G in
memory matters to me :)
  + Larger .o files mean more I/O. This may be more significant
when the storage is remote.
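
To make the .llvm_addrsig point concrete, here is a minimal sketch that stores a list of symbol indexes as a ULEB128 stream and reports its size. For comparison, one Elf64_Rela entry per symbol costs 24 bytes (r_offset, r_info and r_addend at 8 bytes each). The helpers below are standalone stand-ins written for this illustration; they are not the LLVM functions, and ulebTableSize is a hypothetical name.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Append v as ULEB128: 7 payload bits per byte, high bit set on every
// byte except the last.
void encodeULEB128(uint64_t v, std::vector<uint8_t> &out) {
  do {
    uint8_t byte = v & 0x7f;
    v >>= 7;
    if (v != 0)
      byte |= 0x80;
    out.push_back(byte);
  } while (v != 0);
}

// Size of one Elf64_Rela entry: three 8-byte fields.
constexpr size_t kElf64RelaSize = 24;

// Bytes needed to store the given symbol indexes as a ULEB128 stream
// (hypothetical helper, for illustration only).
size_t ulebTableSize(const std::vector<uint32_t> &symIndexes) {
  std::vector<uint8_t> buf;
  for (uint32_t idx : symIndexes)
    encodeULEB128(idx, buf);
  return buf.size();
}
```

With this sketch, symbol indexes below 128 take one byte each and indexes below 16384 take two, so a table of four small indexes occupies a handful of bytes where four Elf64_Rela entries would occupy 4 * kElf64RelaSize = 96 bytes.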


Re: CREL relocation format for ELF (was: RELLEB)

2024-03-28 Thread Alan Modra via Gcc
On Fri, Mar 22, 2024 at 06:51:41PM -0700, Fangrui Song wrote:
> On Thu, Mar 14, 2024 at 5:16 PM Fangrui Song wrote:
> > I propose RELLEB, a new format offering significant file size
> > reductions: 17.2% (x86-64), 16.5% (aarch64), and even 32.4% (riscv64)!
> >
> > Your thoughts on RELLEB are welcome!

Does anyone really care about relocatable object file size?  If they
do, wouldn't they be better off using a compressed file system?

-- 
Alan Modra
Australia Development Lab, IBM


Re: CREL relocation format for ELF (was: RELLEB)

2024-03-28 Thread Fangrui Song via Gcc
On Fri, Mar 22, 2024 at 6:51 PM Fangrui Song wrote:
>
> On Thu, Mar 14, 2024 at 5:16 PM Fangrui Song wrote:
> >
> > The relocation formats REL and RELA for ELF are inefficient. In a
> > release build of Clang for x86-64, .rela.* sections consume a
> > significant portion (approximately 20.9%) of the file size.
> >
> > I propose RELLEB, a new format offering significant file size
> > reductions: 17.2% (x86-64), 16.5% (aarch64), and even 32.4% (riscv64)!
> >
> > Your thoughts on RELLEB are welcome!
> >
> > Detailed analysis:
> > https://maskray.me/blog/2024-03-09-a-compact-relocation-format-for-elf
> > generic ABI (ELF specification):
> > https://groups.google.com/g/generic-abi/c/yb0rjw56ORw
> > binutils feature request: 
> > https://sourceware.org/bugzilla/show_bug.cgi?id=31475
> > LLVM: 
> > https://discourse.llvm.org/t/rfc-relleb-a-compact-relocation-format-for-elf/77600
> >
> > Implementation primarily involves binutils changes. Any volunteers?
> > For GCC, a driver option like -mrelleb in my Clang prototype would be
> > needed. The option instructs the assembler to use RELLEB.
>
> The format was tentatively named RELLEB. As I refine the original pure
> LEB-based format, “RELLEB” might not be the most fitting name.
>
> I have switched to SHT_CREL/DT_CREL/.crel and updated
> https://maskray.me/blog/2024-03-09-a-compact-relocation-format-for-elf
> and
> https://groups.google.com/g/generic-abi/c/yb0rjw56ORw/m/eiBcYxSfAQAJ
>
> The new format is simpler and better than RELLEB even in the absence
> of the shifted offset technique.
>
> Dynamic relocations using CREL are even smaller than Android's packed
> relocations.
>
> // encodeULEB128(uint64_t, raw_ostream &os);
> // encodeSLEB128(int64_t, raw_ostream &os);
>
> Elf_Addr offsetMask = 8, offset = 0, addend = 0;
> uint32_t symidx = 0, type = 0;
> // Trailing zero bits shared by all offsets can be shifted out.
> for (const Reloc &rel : relocs)
>   offsetMask |= rel.r_offset;
> int shift = std::countr_zero(offsetMask);
> // Header: relocation count and shift packed into one ULEB128.
> encodeULEB128(relocs.size() * 4 + shift, os);
> for (const Reloc &rel : relocs) {
>   Elf_Addr deltaOffset = (rel.r_offset - offset) >> shift;
>   offset = rel.r_offset;
>   // Bits 0-2 flag a changed symidx/type/addend; the delta offset
>   // occupies the bits above them.
>   uint8_t b = deltaOffset * 8 + (symidx != rel.r_symidx) +
>               (type != rel.r_type ? 2 : 0) + (addend != rel.r_addend ? 4 : 0);
>   if (deltaOffset < 0x10) {
>     os << char(b);
>   } else {
>     os << char(b | 0x80);
>     encodeULEB128(deltaOffset >> 4, os);
>   }
>   if (b & 1) {
>     encodeSLEB128(static_cast<int32_t>(rel.r_symidx - symidx), os);
>     symidx = rel.r_symidx;
>   }
>   if (b & 2) {
>     encodeSLEB128(static_cast<int32_t>(rel.r_type - type), os);
>     type = rel.r_type;
>   }
>   if (b & 4) {
>     encodeSLEB128(std::make_signed_t<Elf_Addr>(rel.r_addend - addend), os);
>     addend = rel.r_addend;
>   }
> }
>
> ---
>
> While alternatives like PrefixVarInt (or a suffix-based variant) might
> excel when encoding larger integers, LEB128 offers advantages when
> most integers fit within one or two bytes, as it avoids the need for
> shift operations in the common one-byte representation.
>
> While we could utilize zigzag encoding (i>>31) ^ (i<<1) to convert
> SLEB128-encoded type/addend to use ULEB128 instead, the generated code
> is inferior to or on par with SLEB128 for one-byte encodings.
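
As a small illustration of the zigzag remark, the sketch below compares encoded sizes only; it says nothing about the quality of the generated code, which is the point made in the quoted text. It uses the 32-bit zigzag formula from the message, (i >> 31) ^ (i << 1), and counts bytes for SLEB128 versus zigzag followed by ULEB128. The function names are made up for this example, and the code assumes an arithmetic right shift for negative values (guaranteed since C++20, and what mainstream compilers do in practice).

```cpp
#include <cstdint>

// Number of bytes in the ULEB128 encoding of v.
int ulebSize(uint64_t v) {
  int n = 1;
  while (v >>= 7)
    ++n;
  return n;
}

// Number of bytes in the SLEB128 encoding of v.
int slebSize(int64_t v) {
  int n = 0;
  bool more;
  do {
    uint8_t byte = v & 0x7f;
    v >>= 7; // arithmetic shift keeps the sign
    // Stop once the remaining bits are pure sign extension of bit 6.
    more = !((v == 0 && !(byte & 0x40)) || (v == -1 && (byte & 0x40)));
    ++n;
  } while (more);
  return n;
}

// 32-bit zigzag: maps 0, -1, 1, -2, ... to 0, 1, 2, 3, ...
uint32_t zigzag32(int32_t i) {
  return (static_cast<uint32_t>(i) << 1) ^ static_cast<uint32_t>(i >> 31);
}
```

Under these definitions both schemes fit exactly the values -64 through 63 in one byte: slebSize(-64) and ulebSize(zigzag32(-64)) are both 1, while 64 and -65 need two bytes either way. So for the small deltas CREL expects, zigzag buys no size advantage and adds an extra transform step.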


We can introduce a gas option --crel, so that users can specify `gcc
-Wa,--crel a.c` (-flto also gets -Wa, options).

I propose that we add another gas option --implicit-addends-for-data
(does the name look good?) to allow non-code sections to use implicit
addends to save space (https://sourceware.org/PR31567).
Using implicit addends primarily benefits debug sections such as
.debug_str_offsets, .debug_names, .debug_addr, .debug_line, but also
data sections such as .eh_frame, .data., .data.rel.ro, .init_array.

-Wa,--implicit-addends-for-data can be used on its own (6.4% .o
reduction in a clang -g -g0 -gpubnames build) or together with
CREL to achieve an even larger size reduction: a single byte for
most .debug_* relocations!
With CREL, concerns about debug section relocations will become a thing
of the past.
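
To illustrate the one-byte claim, here is a self-contained sketch that follows the draft encoder quoted earlier in this message. It is not the final CREL specification. The assumptions are: relocations are sorted by offset, each offset is encoded as a delta from the previous relocation, output goes to a std::vector<uint8_t> rather than a raw_ostream, and the Reloc struct and encodeCrelDraft name are hypothetical. A plain loop stands in for std::countr_zero so the code does not require C++20.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical in-memory relocation record for this illustration.
struct Reloc {
  uint64_t r_offset;
  uint32_t r_symidx;
  uint32_t r_type;
  int64_t r_addend;
};

void encodeULEB128(uint64_t v, std::vector<uint8_t> &os) {
  do {
    uint8_t byte = v & 0x7f;
    v >>= 7;
    if (v != 0)
      byte |= 0x80;
    os.push_back(byte);
  } while (v != 0);
}

void encodeSLEB128(int64_t v, std::vector<uint8_t> &os) {
  bool more;
  do {
    uint8_t byte = v & 0x7f;
    v >>= 7; // assumes arithmetic shift for negative values
    more = !((v == 0 && !(byte & 0x40)) || (v == -1 && (byte & 0x40)));
    if (more)
      byte |= 0x80;
    os.push_back(byte);
  } while (more);
}

// Draft-format encoder: header, then one flag/offset byte per relocation,
// followed by SLEB128 deltas only for the fields that changed.
std::vector<uint8_t> encodeCrelDraft(const std::vector<Reloc> &relocs) {
  std::vector<uint8_t> os;
  uint64_t offsetMask = 8, offset = 0;
  int64_t addend = 0;
  uint32_t symidx = 0, type = 0;
  for (const Reloc &rel : relocs)
    offsetMask |= rel.r_offset;
  int shift = 0; // count trailing zeros; offsetMask is never 0
  while (((offsetMask >> shift) & 1) == 0)
    ++shift;
  encodeULEB128(relocs.size() * 4 + shift, os);
  for (const Reloc &rel : relocs) {
    assert(rel.r_offset >= offset && "relocations must be sorted by offset");
    uint64_t deltaOffset = (rel.r_offset - offset) >> shift;
    offset = rel.r_offset;
    uint8_t b = static_cast<uint8_t>(
        deltaOffset * 8 + (symidx != rel.r_symidx) +
        (type != rel.r_type ? 2 : 0) + (addend != rel.r_addend ? 4 : 0));
    if (deltaOffset < 0x10) {
      os.push_back(b);
    } else {
      os.push_back(b | 0x80);
      encodeULEB128(deltaOffset >> 4, os);
    }
    if (b & 1) {
      encodeSLEB128(static_cast<int32_t>(rel.r_symidx - symidx), os);
      symidx = rel.r_symidx;
    }
    if (b & 2) {
      encodeSLEB128(static_cast<int32_t>(rel.r_type - type), os);
      type = rel.r_type;
    }
    if (b & 4) {
      encodeSLEB128(rel.r_addend - addend, os);
      addend = rel.r_addend;
    }
  }
  return os;
}
```

As a usage example, take 100 relocations shaped like those of a .debug_str_offsets section with implicit addends: offsets 8, 12, 16, ..., all against the same section symbol (index 3 here), all of type 10, all with addend 0. The sketch produces 104 bytes: 2 for the header, 3 for the first relocation, and 1 for each of the remaining 99. The same relocations take 2400 bytes as Elf64_Rela and 1600 bytes as Elf64_Rel.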