Re: [PATCH v9] pgo: add clang's Profile Guided Optimization infrastructure

2021-04-07 Thread Fāng-ruì Sòng
On Wed, Apr 7, 2021 at 2:22 PM Kees Cook  wrote:
>
> On Wed, Apr 07, 2021 at 02:17:04PM -0700, 'Bill Wendling' via Clang Built 
> Linux wrote:
> > From: Sami Tolvanen 
> >
> > Enable the use of clang's Profile-Guided Optimization[1]. To generate a
> > profile, the kernel is instrumented with PGO counters, a representative
> > workload is run, and the raw profile data is collected from
> > /sys/kernel/debug/pgo/profraw.
> >
> > The raw profile data must be processed by clang's "llvm-profdata" tool
> > before it can be used during recompilation:
> >
> >   $ cp /sys/kernel/debug/pgo/profraw vmlinux.profraw
> >   $ llvm-profdata merge --output=vmlinux.profdata vmlinux.profraw
> >
> > Multiple raw profiles may be merged during this step.
> >
> > The data can now be used by the compiler:
> >
> >   $ make LLVM=1 KCFLAGS=-fprofile-use=vmlinux.profdata ...
> >
> > This initial submission is restricted to x86, as that's the platform we
> > know works. This restriction can be lifted once other platforms have
> > been verified to work with PGO.
> >
> > Note that this method of profiling the kernel is clang-native, unlike
> > the clang support in kernel/gcov.
> >
> > [1] https://clang.llvm.org/docs/UsersManual.html#profile-guided-optimization
> >
> > Signed-off-by: Sami Tolvanen 
> > Co-developed-by: Bill Wendling 
> > Signed-off-by: Bill Wendling 
> > Tested-by: Nick Desaulniers 
> > Reviewed-by: Nick Desaulniers 
> > Reviewed-by: Fangrui Song 
>
> Thanks for sending this again! I'm looking forward to using it.

Yay. Quite excited about that:)

> Masahiro and Andrew, unless one of you would prefer to take this in your
> tree, I figure I can snag it to send to Linus.
>
> Anyone else have feedback?

I have carefully compared the implementation and the original
implementation in llvm-project/compiler-rt.
This looks great.
Also very happy about the cleaner include/asm-generic/vmlinux.lds.h now.

Just adding a note here for folks who may want to help test the
not-yet-common option LD_DEAD_CODE_DATA_ELIMINATION.
--gc-sections may not work perfectly with some advanced PGO features
before Clang 13 (not broken but probably just in an inferior state).
There were some upstream changes in this area recently and I think as
of my https://reviews.llvm.org/D97649 things should be perfect with GC
now.
This does not deserve any comment without more testing, though.

Thanks for already carrying my Reviewed-by tag.

> Thanks!
>
> -Kees
>
> --
> Kees Cook



-- 
宋方睿


Re: [PATCH] Replace __toc_start + 0x8000 with .TOC.

2021-03-06 Thread Fāng-ruì Sòng
On Sat, Mar 6, 2021 at 10:25 PM Segher Boessenkool
 wrote:
>
> Hi!
>
> On Sat, Mar 06, 2021 at 09:14:33PM -0800, Fangrui Song wrote:
> > TOC relocations are like GOT relocations on other architectures.
> > However, unlike other architectures, GNU ld's ppc64 port defines .TOC.
> > relative to the .got output section instead of the linker synthesized
> > .got input section. LLD defines .TOC. as the .got input section plus
> > 0x8000. When CONFIG_PPC_OF_BOOT_TRAMPOLINE=y,
> > arch/powerpc/kernel/prom_init.o is built, and LLD computed .TOC. can be
> > different from __toc_start defined by the linker script.
> >
> > Simplify kernel_toc_addr with asm label .TOC. so that we can get rid of
> > __toc_start.
> >
> > With this change, powernv_defconfig with CONFIG_PPC_OF_BOOT_TRAMPOLINE=y
> > is bootable with LLD. There is still an untriaged issue with Alexey's
> > configuration.
>
> Do you have any explanation why this *does* work, while the original
> doesn't?  Some explanation that says *what* is wrong.  To me it doesn't
> look like the kernel script is.
>
>
> Segher

The kernel code probably wants to access .TOC. (the TOC base symbol)
via __toc_start+0x8000.
If the kernel understood TOC base is different from the linker
understood TOC base (.TOC.), there should be a problem.
By using .TOC. in the kernel code, the two concepts are guaranteed to match.


Re: [PATCH 1/2] Makefile: Remove '--gcc-toolchain' flag

2021-03-04 Thread Fāng-ruì Sòng
On Wed, Mar 3, 2021 at 3:07 PM Fangrui Song  wrote:
>
>
> On 2021-03-03, Masahiro Yamada wrote:
> >Hi.
> >
> >On Wed, Mar 3, 2021 at 6:44 AM Fangrui Song  wrote:
> >>
> >> Reviewed-by: Fangrui Song 
> >>
> >> Thanks for the clean-up!
> >> --gcc-toolchain= is an obsscure way searching for GCC installation 
> >> prefixes (--prefix).
> >> The logic is complex and different for different 
> >> distributions/architectures.
> >>
> >> If we specify --prefix= (-B) explicitly, --gcc-toolchain is not needed.
> >
> >
> >I tested this, and worked for me too.
> >
> >Before applying this patch, could you please
> >help me understand the logic?
> >
> >
> >
> >
> >I checked the manual
> >(https://clang.llvm.org/docs/ClangCommandLineReference.html#cmdoption-clang-b-dir)
> >
> >
> >
> >-B, --prefix , --prefix=
> >Add  to search path for binaries and object files used implicitly
> >
> >--gcc-toolchain=, -gcc-toolchain 
> >Use the gcc toolchain at the given directory
> >
> >
> >Hmm, this description is too concise
> >to understand how it works...
> >
> >
> >
> >I use Ubuntu 20.10.
> >
> >I use distro's default clang
> >located in /usr/bin/clang.
> >
> >I place my aarch64 linaro toolchain in
> >/home/masahiro/tools/aarch64-linaro-7.5/bin/aarch64-linux-gnu-gcc,
> >which is not in my PATH environment.
> >
> >
> >
> >
> >From my some experiments,
> >
> >clang --target=aarch64-linux-gnu -no-integrated-as \
> >--prefix=/home/masahiro/tools/aarch64-linaro-7.5/bin/aarch64-linux-gnu-  ...
> >
> >works almost equivalent to
> >
> >PATH=/home/masahiro/tools/aarch64-linaro-7.5/bin:$PATH \
> >clang --target=aarch64-linux-gnu -no-integrated-as ...
> >
> >
> >Then, clang will pick up aarch64-linux-gnu-as
> >found in the search path.
> >
> >Is this correct?
> >
> >
> >On the other hand, I could not understand
> >what the purpose of --gcc-toolchain= is.
> >
> >
> >Even if I add --gcc-toolchain=/home/masahiro/tools/aarch64-linaro-7.5,
> >it does not make any difference, and it is completely useless.
> >
> >
> >I read the comment from stephenhines:
> >https://github.com/ClangBuiltLinux/linux/issues/78
> >
> >How could --gcc-toolchain be used
> >in a useful way?
>
> --gcc-toolchain was introduced in
> https://reviews.llvm.org/rG1af7c219c7113a35415651127f05cdf056b63110
> to provide a flexible alternative to autoconf configure-time 
> --with-gcc-toolchain (now cmake variable GCC_INSTALL_PREFIX).
>
> I agree the option is confusing, the documentation is poor, and probably very 
> few people understand what it does.
> I apologize that my previous reply is not particular correct.
> So the more correct answer is below:
>
>
> A --prefix option can specify either of
>
> 1) A directory (for something like /a/b/lib/gcc/arm-linux-androideabi, this 
> should be /a/b, the parent directory of 'lib')
> 2) A path fragment like /usr/bin/aarch64-linux-gnu-
>
> The directory values of the --prefix list and --gcc-toolchain are used to 
> detect GCC installation directories. The directory is used to fetch include 
> directories, system library directories and binutils directories (as, 
> objcopy).
> (See below, Linux kernel only needs the binutils executables, so the 
> include/library logic is really useless to us)
>
> The logic is around 
> https://github.com/llvm/llvm-project/blob/main/clang/lib/Driver/ToolChains/Gnu.cpp#L1910
>
>Prefixes = --prefix/-B list (only the directory subset is effective)
>StringRef GCCToolchainDir = --gcc-toolchain= or CMake variable 
> GCC_INSTALL_PREFIX
>if (GCCToolchainDir != "") {
>  Prefixes.push_back(std::string(GCCToolchainDir));
>} else {
>  if (!D.SysRoot.empty()) {
>Prefixes.push_back(D.SysRoot);
>// Add D.SysRoot+"/usr" to Prefixes. Some distributions add more 
> directories.
>AddDefaultGCCPrefixes(TargetTriple, Prefixes, D.SysRoot);
>  }
>
>  // D.InstalledDir is the directory of the clang executable, e.g. /usr/bin
>  Prefixes.push_back(D.InstalledDir + "/..");
>
>  if (D.SysRoot.empty())
>AddDefaultGCCPrefixes(TargetTriple, Prefixes, D.SysRoot);
>}
>
>// Gentoo / ChromeOS specific logic.
>// I think this block is misplaced.
>if (GCCToolchainDir == "" || GCCToolchainDir == D.SysRoot + "/usr") {
>  ...
>}
>
>// Loop over the various components which exist and select the best GCC
>// installation available. GCC installs are ranked by version number.
>Version = GCCVersion::Parse("0.0.0");
>for (const std::string  : Prefixes) {
>  auto  = D.getVFS();
>  if (!VFS.exists(Prefix))
>continue;
>
>  // CandidateLibDirs is a subset of {/lib64, /lib32, /lib}.
>  for (StringRef Suffix : CandidateLibDirs) {
>const std::string LibDir = Prefix + Suffix.str();
>if (!VFS.exists(LibDir))
>  continue;
>bool GCCDirExists = VFS.exists(LibDir + "/gcc");
>bool GCCCrossDirExists = VFS.exists(LibDir + "/gcc-cross");
>
>// Precise match. Detect 

Re: ERROR: INT DW_ATE_unsigned_1 Error emitting BTF type

2021-02-05 Thread Fāng-ruì Sòng
On Fri, Feb 5, 2021 at 11:21 AM Sedat Dilek  wrote:
>
> On Fri, Feb 5, 2021 at 8:15 PM Sedat Dilek  wrote:
> >
> > On Fri, Feb 5, 2021 at 8:10 PM Yonghong Song  wrote:
> > >
> > >
> > >
> > > On 2/5/21 11:06 AM, Sedat Dilek wrote:
> > > > On Fri, Feb 5, 2021 at 7:53 PM Sedat Dilek  
> > > > wrote:
> > > >>
> > > >> On Fri, Feb 5, 2021 at 6:48 PM Sedat Dilek  
> > > >> wrote:
> > > >>>
> > > >>> On Fri, Feb 5, 2021 at 4:28 PM Arnaldo Carvalho de Melo
> > > >>>  wrote:
> > > 
> > >  Em Fri, Feb 05, 2021 at 04:23:59PM +0100, Sedat Dilek escreveu:
> > > > On Fri, Feb 5, 2021 at 3:41 PM Sedat Dilek  
> > > > wrote:
> > > >>
> > > >> On Fri, Feb 5, 2021 at 3:37 PM Sedat Dilek  
> > > >> wrote:
> > > >>>
> > > >>> Hi,
> > > >>>
> > > >>> when building with pahole v1.20 and binutils v2.35.2 plus Clang
> > > >>> v12.0.0-rc1 and DWARF-v5 I see:
> > > >>> ...
> > > >>> + info BTF .btf.vmlinux.bin.o
> > > >>> + [  != silent_ ]
> > > >>> + printf   %-7s %s\n BTF .btf.vmlinux.bin.o
> > > >>>   BTF .btf.vmlinux.bin.o
> > > >>> + LLVM_OBJCOPY=/opt/binutils/bin/objcopy /opt/pahole/bin/pahole -J
> > > >>> .tmp_vmlinux.btf
> > > >>> [115] INT DW_ATE_unsigned_1 Error emitting BTF type
> > > >>> Encountered error while encoding BTF.
> > > >>
> > > >> Grepping the pahole sources:
> > > >>
> > > >> $ git grep DW_ATE
> > > >> dwarf_loader.c: bt->is_bool = encoding == DW_ATE_boolean;
> > > >> dwarf_loader.c: bt->is_signed = encoding == DW_ATE_signed;
> > > >>
> > > >> Missing DW_ATE_unsigned encoding?
> > > >>
> > > >
> > > > Checked the LLVM sources:
> > > >
> > > > clang/lib/CodeGen/CGDebugInfo.cpp:Encoding =
> > > > llvm::dwarf::DW_ATE_unsigned_char;
> > > > clang/lib/CodeGen/CGDebugInfo.cpp:Encoding = 
> > > > llvm::dwarf::DW_ATE_unsigned;
> > > > clang/lib/CodeGen/CGDebugInfo.cpp:Encoding =
> > > > llvm::dwarf::DW_ATE_unsigned_fixed;
> > > > clang/lib/CodeGen/CGDebugInfo.cpp:
> > > >? llvm::dwarf::DW_ATE_unsigned
> > > > ...
> > > > lld/test/wasm/debuginfo.test:CHECK-NEXT:
> > > > DW_AT_encoding
> > > >   (DW_ATE_unsigned)
> > > >
> > > > So, I will switch from GNU ld.bfd v2.35.2 to LLD-12.
> > > 
> > >  Thanks for the research, probably your conclusion is correct, can 
> > >  you go
> > >  the next step and add that part and check if the end result is the
> > >  expected one?
> > > 
> > > >>>
> > > >>> Still building...
> > > >>>
> > > >>> Can you give me a hand on what has to be changed in dwarves/pahole?
> > > >>>
> > > >>> I guess switching from ld.bfd to ld.lld will show the same ERROR.
> > > >>>
> > > >>
> > > >> This builds successfully - untested:
> > > >>
> > > >> $ git diff
> > > >> diff --git a/btf_loader.c b/btf_loader.c
> > > >> index ec286f413f36..a39edd3362db 100644
> > > >> --- a/btf_loader.c
> > > >> +++ b/btf_loader.c
> > > >> @@ -107,6 +107,7 @@ static struct base_type *base_type__new(strings_t
> > > >> name, uint32_t attrs,
> > > >> bt->bit_size = size;
> > > >> bt->is_signed = attrs & BTF_INT_SIGNED;
> > > >> bt->is_bool = attrs & BTF_INT_BOOL;
> > > >> +   bt->is_unsigned = attrs & BTF_INT_UNSIGNED;
> > > >> bt->name_has_encoding = false;
> > > >> bt->float_type = float_type;
> > > >> }
> > > >> diff --git a/ctf.h b/ctf.h
> > > >> index 25b79892bde3..9e47c3c74677 100644
> > > >> --- a/ctf.h
> > > >> +++ b/ctf.h
> > > >> @@ -100,6 +100,7 @@ struct ctf_full_type {
> > > >> #define CTF_TYPE_INT_CHAR  0x2
> > > >> #define CTF_TYPE_INT_BOOL  0x4
> > > >> #define CTF_TYPE_INT_VARARGS   0x8
> > > >> +#define CTF_TYPE_INT_UNSIGNED  0x16
> > > >>
> > > >> #define CTF_TYPE_FP_ATTRS(VAL) ((VAL) >> 24)
> > > >> #define CTF_TYPE_FP_OFFSET(VAL)(((VAL) >> 16) & 0xff)
> > > >> diff --git a/dwarf_loader.c b/dwarf_loader.c
> > > >> index b73d7867e1e6..79d40f183c24 100644
> > > >> --- a/dwarf_loader.c
> > > >> +++ b/dwarf_loader.c
> > > >> @@ -473,6 +473,7 @@ static struct base_type *base_type__new(Dwarf_Die
> > > >> *die, struct cu *cu)
> > > >> bt->is_bool = encoding == DW_ATE_boolean;
> > > >> bt->is_signed = encoding == DW_ATE_signed;
> > > >> bt->is_varargs = false;
> > > >> +   bt->is_unsigned = encoding == DW_ATE_unsigned;
> > > >> bt->name_has_encoding = true;
> > > >> }
> > > >>
> > > >> diff --git a/dwarves.h b/dwarves.h
> > > >> index 98caf1abc54d..edf32d2e6f80 100644
> > > >> --- a/dwarves.h
> > > >> +++ b/dwarves.h
> > > >> @@ -1261,6 +1261,7 @@ struct base_type {
> > > >> uint8_t is_signed:1;
> > > >> uint8_t is_bool:1;
> > > >> uint8_t is_varargs:1;
> > > >> +   uint8_t 

Re: [PATCH v7 2/2] Kbuild: implement support for DWARF v5

2021-01-30 Thread Fāng-ruì Sòng
On Sat, Jan 30, 2021 at 3:10 PM Sedat Dilek  wrote:
>
> On Sat, Jan 30, 2021 at 1:44 AM Nick Desaulniers
>  wrote:
> >
> > DWARF v5 is the latest standard of the DWARF debug info format.
> >
> > Feature detection of DWARF5 is onerous, especially given that we've
> > removed $(AS), so we must query $(CC) for DWARF5 assembler directive
> > support.
> >
> > The DWARF version of a binary can be validated with:
> > $ llvm-dwarfdump vmlinux | head -n 4 | grep version
> > or
> > $ readelf --debug-dump=info vmlinux 2>/dev/null | grep Version
> >
> > DWARF5 wins significantly in terms of size when mixed with compression
> > (CONFIG_DEBUG_INFO_COMPRESSED).
> >
> > 363Mvmlinux.clang12.dwarf5.compressed
> > 434Mvmlinux.clang12.dwarf4.compressed
> > 439Mvmlinux.clang12.dwarf2.compressed
> > 457Mvmlinux.clang12.dwarf5
> > 536Mvmlinux.clang12.dwarf4
> > 548Mvmlinux.clang12.dwarf2
> >
> > 515Mvmlinux.gcc10.2.dwarf5.compressed
> > 599Mvmlinux.gcc10.2.dwarf4.compressed
> > 624Mvmlinux.gcc10.2.dwarf2.compressed
> > 630Mvmlinux.gcc10.2.dwarf5
> > 765Mvmlinux.gcc10.2.dwarf4
> > 809Mvmlinux.gcc10.2.dwarf2
> >
> > Though the quality of debug info is harder to quantify; size is not a
> > proxy for quality.
> >
> > Jakub notes:
> >   All [GCC] 5.1 - 6.x did was start accepting -gdwarf-5 as experimental
> >   option that enabled some small DWARF subset (initially only a few
> >   DW_LANG_* codes newly added to DWARF5 drafts).  Only GCC 7 (released
> >   after DWARF 5 has been finalized) started emitting DWARF5 section
> >   headers and got most of the DWARF5 changes in...
> >
> > Version check GCC so that we don't need to worry about the difference in
> > command line args between GNU readelf and llvm-readelf/llvm-dwarfdump to
> > validate the DWARF Version in the assembler feature detection script.
> >
> > GNU `as` only recently gained support for specifying -gdwarf-5, so when
> > compiling with Clang but without Clang's integrated assembler
> > (LLVM_IAS=1 is not set), explicitly add -Wa,-gdwarf-5 to DEBUG_CFLAGS.
> >
> > Disabled for now if CONFIG_DEBUG_INFO_BTF is set; pahole doesn't yet
> > recognize the new additions to the DWARF debug info. Thanks to Sedat for
> > the report.
> >
> > Link: http://www.dwarfstd.org/doc/DWARF5.pdf
> > Reported-by: Sedat Dilek 
> > Suggested-by: Arvind Sankar 
> > Suggested-by: Caroline Tice 
> > Suggested-by: Fangrui Song 
> > Suggested-by: Jakub Jelinek 
> > Suggested-by: Masahiro Yamada 
> > Suggested-by: Nathan Chancellor 
> > Signed-off-by: Nick Desaulniers 
> > ---
> >  Makefile  |  1 +
> >  include/asm-generic/vmlinux.lds.h |  7 ++-
> >  lib/Kconfig.debug | 18 ++
> >  scripts/test_dwarf5_support.sh|  8 
> >  4 files changed, 33 insertions(+), 1 deletion(-)
> >  create mode 100755 scripts/test_dwarf5_support.sh
> >
> > diff --git a/Makefile b/Makefile
> > index d2b4980807e0..5387a6f2f62d 100644
> > --- a/Makefile
> > +++ b/Makefile
> > @@ -831,6 +831,7 @@ KBUILD_AFLAGS   += -Wa,-gdwarf-2
> >  endif
> >
> >  dwarf-version-$(CONFIG_DEBUG_INFO_DWARF4) := 4
> > +dwarf-version-$(CONFIG_DEBUG_INFO_DWARF5) := 5
> >  DEBUG_CFLAGS   += -gdwarf-$(dwarf-version-y)
> >
> >  ifdef CONFIG_DEBUG_INFO_REDUCED
> > diff --git a/include/asm-generic/vmlinux.lds.h 
> > b/include/asm-generic/vmlinux.lds.h
> > index 34b7e0d2346c..1e7cde4bd3f9 100644
> > --- a/include/asm-generic/vmlinux.lds.h
> > +++ b/include/asm-generic/vmlinux.lds.h
> > @@ -842,8 +842,13 @@
> > /* DWARF 4 */   \
> > .debug_types0 : { *(.debug_types) } \
> > /* DWARF 5 */   \
> > +   .debug_addr 0 : { *(.debug_addr) }  \
> > +   .debug_line_str 0 : { *(.debug_line_str) }  \
> > +   .debug_loclists 0 : { *(.debug_loclists) }  \
> > .debug_macro0 : { *(.debug_macro) } \
> > -   .debug_addr 0 : { *(.debug_addr) }
> > +   .debug_names0 : { *(.debug_names) } \
> > +   .debug_rnglists 0 : { *(.debug_rnglists) }  \
> > +   .debug_str_offsets  0 : { *(.debug_str_offsets) }
> >
>
> I just looked at binutils 2.36 in the Debian/experimental repositories.
>
> [1] says:
>
> + PR ld/27230
> + * scripttempl/DWARF.sc: Add DWARF-5 .debug_* sections.
>
> ...
>
> -  /* DWARF Extension.  */
> -  .debug_macro0 : { *(.debug_macro) }
> +  /* DWARF 5.  */
>.debug_addr 0 : { *(.debug_addr) }
> +  .debug_line_str 0 : { *(.debug_line_str) }
> +  .debug_loclists 0 : { *(.debug_loclists) }
> +  .debug_macro0 : { *(.debug_macro) }
> +  .debug_names0 : { *(.debug_names) }
> +  .debug_rnglists 0 : { *(.debug_rnglists) }
> +  .debug_str_offsets 0 : { *(.debug_str_offsets) }
> +  

Re: [PATCH v6 0/2] Kbuild: DWARF v5 support

2021-01-29 Thread Fāng-ruì Sòng
On Fri, Jan 29, 2021 at 4:46 PM 'Nick Desaulniers' via Clang Built
Linux  wrote:
>
> On Fri, Jan 29, 2021 at 4:08 PM Sedat Dilek  wrote:
> >
> > On Fri, Jan 29, 2021 at 8:43 PM Nick Desaulniers
> >  wrote:
> > >
> > > DWARF v5 is the latest standard of the DWARF debug info format.
> > >
> > > DWARF5 wins significantly in terms of size and especially so when mixed
> > > with compression (CONFIG_DEBUG_INFO_COMPRESSED).
> > >
> > > Link: http://www.dwarfstd.org/doc/DWARF5.pdf
> > >
> > > Patch 1 is a cleanup that lays the ground work and isn't DWARF
> > > v5 specific.
> > > Patch 2 implements Kconfig and Kbuild support for DWARFv5.
> > >
> >
> > When you will do a v7...
> >
> > Can you look also at places where we have hardcoded DWARF-2 handling...
>
> Ah, sorry, I just saw this now, after sending v7.  Can we wait to
> purge DWARF v2 until after we have DWARF v5?
>
> In fact, if they are orthogonal like I suspect, why don't you send
> some patches and I will help you test them?
> --
> Thanks,
> ~Nick Desaulniers

Basically the distinction is just between DWARF v2 .debug_line and
DWARF v5 .debug_line .
(-gdwarf-4 adds an extra maximum_operations_per_instruction (constant
1) compared with -gdwarf-2 but that can mostly be ignored).

Refinement among -gdwarf-[234] just clarifies things and is not going
to affect debugging experience in any case.
So I agree with Nick that it can be done separately.
Note: such clarification can make things a bit ugly because binutils
before 2020 does not recognize -gdwarf-[345].



-- 
宋方睿


Re: [PATCH] checkpatch: add warning for avoiding .L prefix symbols in assembly files

2021-01-22 Thread Fāng-ruì Sòng
On Fri, Jan 22, 2021 at 12:13 PM Aditya  wrote:
>
> On 23/1/21 12:40 am, Joe Perches wrote:
> > On Fri, 2021-01-22 at 18:48 +0530, Aditya wrote:
> >> On 21/1/21 12:13 am, Joe Perches wrote:
> >>> I believe the test should be:
> >>>
> >>> if ($realfile =~ /\.S$/ &&
> >>> $line =~ 
> >>> /^\+\s*SYM_[A-Z]+_(?:START|END)(?:_[A-Z_]+)?\s*\(\s*\.L/) {
> >>
> >> Joe, with this regex, we won't be able to match
> >> "jmp  .L_restore
> >> SYM_FUNC_END(\name)"
> >
> > I think that's not an issue.
> >
> >> which was replaced in this patch in the discussion:
> >> https://groups.google.com/g/clang-built-linux/c/-drkmLgu-cU/m/phKe-Tb6CgAJ
> >>
> >> Here, "jmp  .L_restore" was also replaced to fix the error.
> >
> > Notice that this line was also replaced in the same patch:
> >
> >  #ifdef CONFIG_PREEMPTION
> > -SYM_CODE_START_LOCAL_NOALIGN(.L_restore)
> > +SYM_CODE_START_LOCAL_NOALIGN(__thunk_restore)
> >
> >
> >> However, if this
> >> regex(/^\+\s*SYM_[A-Z]+_(?:START|END)(?:_[A-Z_]+)?\s*\(\s*\.L/) is
> >> correct, maybe we don't need to check for $file as we are now checking
> >> for just one line.
> >>
> >> What do you think?
> >
> > The test I wrote was complete, did not use $file and emits a
> > warning on the use of CODE_SYM_START_LOCAL_NOALIGN(.L_restore)
> >
> > I think it's sufficient.
> >
>
> Okay, Thanks.. I will send the modified patch :)
>
> Thanks
> Aditya

I am slightly concerned with the direction here.   jmp .Lsym is fine
and is probably preferable in cases where you absolutely want to emit
a local symbol.
(there is a related -z unique-symbol GNU ld+FGKASLR thing I shall
analyze in my spare time)

If the problem is just that STT_SECTION symbols are missing, can
objtool do something to conceptually synthesize them?


Re: [PATCH v3] x86: Treat R_386_PLT32 as R_386_PC32

2021-01-22 Thread Fāng-ruì Sòng
On Thu, Jan 14, 2021 at 2:48 PM Fangrui Song  wrote:
>
> This is similar to commit b21ebf2fb4cd ("x86: Treat R_X86_64_PLT32 as
> R_X86_64_PC32"), but for i386.  As far as Linux kernel is concerned,
> R_386_PLT32 can be treated the same as R_386_PC32.
>
> R_386_PC32/R_X86_64_PC32 are PC-relative relocation types with the
> requirement that the symbol address is significant.
> R_386_PLT32/R_X86_64_PLT32 are PC-relative relocation types without the
> address significance requirement.
>
> On x86-64, there is no PIC vs non-PIC PLT distinction and an
> R_X86_64_PLT32 relocation is produced for both `call/jmp foo` and
> `call/jmp foo@PLT` with newer (2018) GNU as/LLVM integrated assembler.
>
> On i386, there are 2 types of PLTs, PIC and non-PIC. Currently the
> convention is to use R_386_PC32 for non-PIC PLT and R_386_PLT32 for PIC
> PLT.
>
> clang-12 -fno-pic (since
> https://github.com/llvm/llvm-project/commit/a084c0388e2a59b9556f2de008232da3f1d6)
> can emit R_386_PLT32 for compiler generated function declarations as
> well to avoid a canonical PLT entry (st_shndx=0, st_value!=0) if the
> symbol turns out to be defined externally. GCC/GNU as will likely keep
> using R_386_PC32 because (1) the ABI is legacy (2) the change will drop
> a GNU ld non-default visibility ifunc for shared objects.
> https://sourceware.org/bugzilla/show_bug.cgi?id=27169
>
> Link: https://github.com/ClangBuiltLinux/linux/issues/1210
> Reported-by: Arnd Bergmann 
> Signed-off-by: Fangrui Song 
> Reviewed-by: Nick Desaulniers 
> Reviewed-by: Nathan Chancellor 
> Tested-by: Nick Desaulniers 
> Tested-by: Nathan Chancellor 
>
> ---
> Change in v2:
> * Improve commit message
> ---
> Change in v3:
> * Change the GCC link to the more relevant GNU as link.
> * Fix the relevant llvm-project commit id.
> ---
>  arch/x86/kernel/module.c | 1 +
>  arch/x86/tools/relocs.c  | 2 ++
>  2 files changed, 3 insertions(+)
>
> diff --git a/arch/x86/kernel/module.c b/arch/x86/kernel/module.c
> index 34b153cbd4ac..5e9a34b5bd74 100644
> --- a/arch/x86/kernel/module.c
> +++ b/arch/x86/kernel/module.c
> @@ -114,6 +114,7 @@ int apply_relocate(Elf32_Shdr *sechdrs,
> *location += sym->st_value;
> break;
> case R_386_PC32:
> +   case R_386_PLT32:
> /* Add the value, subtract its position */
> *location += sym->st_value - (uint32_t)location;
> break;
> diff --git a/arch/x86/tools/relocs.c b/arch/x86/tools/relocs.c
> index ce7188cbdae5..717e48ca28b6 100644
> --- a/arch/x86/tools/relocs.c
> +++ b/arch/x86/tools/relocs.c
> @@ -867,6 +867,7 @@ static int do_reloc32(struct section *sec, Elf_Rel *rel, 
> Elf_Sym *sym,
> case R_386_PC32:
> case R_386_PC16:
> case R_386_PC8:
> +   case R_386_PLT32:
> /*
>  * NONE can be ignored and PC relative relocations don't
>  * need to be adjusted.
> @@ -910,6 +911,7 @@ static int do_reloc_real(struct section *sec, Elf_Rel 
> *rel, Elf_Sym *sym,
> case R_386_PC32:
> case R_386_PC16:
> case R_386_PC8:
> +   case R_386_PLT32:
> /*
>  * NONE can be ignored and PC relative relocations don't
>  * need to be adjusted.
> --
> 2.30.0.296.g2bfb1c46d8-goog
>

Ping:)


Re: [PATCH v2] module: Ignore _GLOBAL_OFFSET_TABLE_ when warning for undefined symbols

2021-01-15 Thread Fāng-ruì Sòng
On Thu, Jan 14, 2021 at 11:04 PM Marco Elver  wrote:
>
> On Thu, 14 Jan 2021 at 22:54, Fangrui Song  wrote:
> > clang-12 -fno-pic (since
> > https://github.com/llvm/llvm-project/commit/a084c0388e2a59b9556f2de008232da3f1d6)
> > can emit `call __stack_chk_fail@PLT` instead of `call __stack_chk_fail`
> > on x86.  The two forms should have identical behaviors on x86-64 but the
> > former causes GNU as<2.37 to produce an unreferenced undefined symbol
> > _GLOBAL_OFFSET_TABLE_.
> >
> > (On x86-32, there is an R_386_PC32 vs R_386_PLT32 difference but the
> > linker behavior is identical as far as Linux kernel is concerned.)
> >
> > Simply ignore _GLOBAL_OFFSET_TABLE_ for now, like what
> > scripts/mod/modpost.c:ignore_undef_symbol does. This also fixes the
> > problem for gcc/clang -fpie and -fpic, which may emit `call foo@PLT` for
> > external function calls on x86.
> >
> > Note: ld -z defs and dynamic loaders do not error for unreferenced
> > undefined symbols so the module loader is reading too much.  If we ever
> > need to ignore more symbols, the code should be refactored to ignore
> > unreferenced symbols.
> >
> > Reported-by: Marco Elver 
> > Link: https://github.com/ClangBuiltLinux/linux/issues/1250
> > Signed-off-by: Fangrui Song 
>
> Tested-by: Marco Elver 
>
> Thank you for the patch!
>
> > ---
> >  kernel/module.c | 20 ++--
> >  1 file changed, 18 insertions(+), 2 deletions(-)
> > ---
> > Changes in v2:
> > * Fix Marco's email address
> > * Add a function ignore_undef_symbol similar to 
> > scripts/mod/modpost.c:ignore_undef_symbol
> >
> > diff --git a/kernel/module.c b/kernel/module.c
> > index 4bf30e4b3eaa..278f5129bde2 100644
> > --- a/kernel/module.c
> > +++ b/kernel/module.c
> > @@ -2348,6 +2348,20 @@ static int verify_exported_symbols(struct module 
> > *mod)
> > return 0;
> >  }
> >
> > +static int ignore_undef_symbol(Elf_Half emachine, const char *name)
>
> Why not 'bool' return-type?

Will use bool and false in v3.

> > +{
> > +   /* On x86, PIC code and Clang non-PIC code may have call foo@PLT. 
> > GNU as
>
> Not sure if checkpatch.pl warns about this, but this multi-line
> comment does not follow the normal kernel-style (see elsewhere in
> file):

It doesn't warn about this. (The commit description warning cannot be
fixed, even if I place the closing paren on the next line.)

% ./scripts/checkpatch.pl
v2-0001-module-Ignore-_GLOBAL_OFFSET_TABLE_-when-warning-.patch
WARNING: Possible unwrapped commit description (prefer a maximum 75
chars per line)
#8:
https://github.com/llvm/llvm-project/commit/a084c0388e2a59b9556f2de008232da3f1d6)

total: 0 errors, 1 warnings, 32 lines checked

NOTE: For some of the reported defects, checkpatch may be able to
  mechanically convert to the typical style using --fix or --fix-inplace.

v2-0001-module-Ignore-_GLOBAL_OFFSET_TABLE_-when-warning-.patch has
style problems, please review.

NOTE: If any of the errors are false positives, please report
  them to the maintainer, see CHECKPATCH in MAINTAINERS.

> /*
>  * ...
>  */
>
> > +* before 2.37 produces an unreferenced _GLOBAL_OFFSET_TABLE_ on 
> > x86-64.
> > +* i386 has a similar problem but may not deserve a fix.
> > +*
> > +* If we ever have to ignore many symbols, consider refactoring the 
> > code to
> > +* only warn if referenced by a relocation.
> > +*/
> > +   if (emachine == EM_386 || emachine == EM_X86_64)
> > +   return !strcmp(name, "_GLOBAL_OFFSET_TABLE_");
> > +   return 0;
> > +}
> > +
> >  /* Change all symbols so that st_value encodes the pointer directly. */
> >  static int simplify_symbols(struct module *mod, const struct load_info 
> > *info)
> >  {
> > @@ -2395,8 +2409,10 @@ static int simplify_symbols(struct module *mod, 
> > const struct load_info *info)
> > break;
> > }
> >
> > -   /* Ok if weak.  */
> > -   if (!ksym && ELF_ST_BIND(sym[i].st_info) == 
> > STB_WEAK)
> > +   /* Ok if weak or ignored.  */
> > +   if (!ksym &&
> > +   (ELF_ST_BIND(sym[i].st_info) == STB_WEAK ||
> > +ignore_undef_symbol(info->hdr->e_machine, 
> > name)))
> > break;
> >
> > ret = PTR_ERR(ksym) ?: -ENOENT;
> > --
> > 2.30.0.296.g2bfb1c46d8-goog
> >



-- 
宋方睿


Re: [PATCH] module: Ignore _GLOBAL_OFFSET_TABLE_ when warning for undefined symbols

2021-01-14 Thread Fāng-ruì Sòng
On Thu, Jan 14, 2021 at 6:06 AM Jessica Yu  wrote:
>
> +++ Fangrui Song [13/01/21 21:48 -0800]:
> >clang-12 -fno-pic (since
> >https://github.com/llvm/llvm-project/commit/a084c0388e2a59b9556f2de008232da3f1d6)
> >can emit `call __stack_chk_fail@PLT` instead of `call __stack_chk_fail`
> >on x86.  The two forms should have identical behaviors on x86-64 but the
> >former causes GNU as<2.37 to produce an unreferenced undefined symbol
> >_GLOBAL_OFFSET_TABLE_.
> >
> >(On x86-32, there is an R_386_PC32 vs R_386_PLT32 difference but the
> >linker behavior is identical as far as Linux kernel is concerned.)
> >
> >Simply ignore _GLOBAL_OFFSET_TABLE_ for now, like what
> >scripts/mod/modpost.c:ignore_undef_symbol does. This also fixes the
> >problem for gcc/clang -fpie and -fpic, which may emit `call foo@PLT` for
> >external function calls on x86.
> >
> >Note: ld -z defs and dynamic loaders do not error for unreferenced
> >undefined symbols so the module loader is reading too much.  If we ever
> >need to ignore more symbols, the code should be refactored to ignore
> >unreferenced symbols.
> >
> >Reported-by: Marco Elver 
> >Link: https://github.com/ClangBuiltLinux/linux/issues/1250
> >Signed-off-by: Fangrui Song 
> >---
> > kernel/module.c | 10 --
> > 1 file changed, 8 insertions(+), 2 deletions(-)
> >
> >diff --git a/kernel/module.c b/kernel/module.c
> >index 4bf30e4b3eaa..2e2deea99289 100644
> >--- a/kernel/module.c
> >+++ b/kernel/module.c
> >@@ -2395,8 +2395,14 @@ static int simplify_symbols(struct module *mod, const 
> >struct load_info *info)
> >   break;
> >   }
> >
> >-  /* Ok if weak.  */
> >-  if (!ksym && ELF_ST_BIND(sym[i].st_info) == STB_WEAK)
> >+  /* Ok if weak. Also allow _GLOBAL_OFFSET_TABLE_:
> >+   * GNU as before 2.37 produces an unreferenced 
> >_GLOBAL_OFFSET_TABLE_
> >+   * for call foo@PLT on x86-64.  If the code ever 
> >needs to ignore
> >+   * more symbols, refactor the code to only warn if 
> >referenced by
> >+   * a relocation.
> >+   */
> >+  if (!ksym && (ELF_ST_BIND(sym[i].st_info) == STB_WEAK 
> >||
> >+!strcmp(name, "_GLOBAL_OFFSET_TABLE_")))
> >   break;
>
> Hi Fangrui,
>
> Thanks for the patch. I am puzzled why we don't already mirror modpost
> here, that particular line of code in modpost to ignore _GLOBAL_OFFSET_TABLE_
> has been there long before my time. Let's properly mirror modpost
> then, and create a similar helper function ignore_undef_symbol() (and
> stick the _GLOBAL_OFFSET_TABLE_ check in there) to account for future
> cases like this.
>
> Thanks,
>
> Jessica

Hi Jessica,

I guess __this_module in scripts/mod/modpost.c:ignore_undef_symbol is
not a problem.
For PPC64 _restgpr0_* and _savegpr0_*, I am not sure ignoring the
undefined functions in kernel/module.c is right.
(I know they can be produced by gcc -Os in some cases
(https://reviews.llvm.org/D79977), but I want to learn whether that is
a real issue before adding them.)

If we ever need to ignore more symbols, the code should be refactored
to not warn for unreferenced undefined symbols as my description says.


Re: [PATCH v2] pgo: add clang's Profile Guided Optimization infrastructure

2021-01-12 Thread Fāng-ruì Sòng
On Tue, Jan 12, 2021 at 9:37 AM 'Nick Desaulniers' via Clang Built
Linux  wrote:
>
> On Mon, Jan 11, 2021 at 9:14 PM Bill Wendling  wrote:
> >
> > From: Sami Tolvanen 
> >
> > Enable the use of clang's Profile-Guided Optimization[1]. To generate a
> > profile, the kernel is instrumented with PGO counters, a representative
> > workload is run, and the raw profile data is collected from
> > /sys/kernel/debug/pgo/profraw.
> >
> > The raw profile data must be processed by clang's "llvm-profdata" tool
> > before it can be used during recompilation:
> >
> >   $ cp /sys/kernel/debug/pgo/profraw vmlinux.profraw
> >   $ llvm-profdata merge --output=vmlinux.profdata vmlinux.profraw
> >
> > Multiple raw profiles may be merged during this step.
> >
> > The data can now be used by the compiler:
> >
> >   $ make LLVM=1 KCFLAGS=-fprofile-use=vmlinux.profdata ...
> >
> > This initial submission is restricted to x86, as that's the platform we
>
> Please drop all changes to arch/* that are not to arch/x86/ then; we
> can cross that bridge when we get to each arch. For example, there's
> no point disabling PGO for architectures LLVM doesn't even have a
> backend for.
>
> > know works. This restriction can be lifted once other platforms have
> > been verified to work with PGO.
> >
> > Note that this method of profiling the kernel is clang-native and isn't
> > compatible with clang's gcov support in kernel/gcov.
>
> Then the Kconfig option should depend on !GCOV so that they are
> mutually exclusive and can't be selected together accidentally; such
> as by bots doing randconfig tests.

The profile formats (Clang PGO, Clang gcov, GCC gcov/PGO) are
different but Clang PGO can be used with Clang's gcov implementation:
clang -fprofile-generate --coverage a.cc; ./a.out => default*.profraw + a.gcda

> 
>
> > +static inline int inst_prof_popcount(unsigned long long value)
> > +{
> > +   value = value - ((value >> 1) & 0xULL);
> > +   value = (value & 0xULL) +
> > +   ((value >> 2) & 0xULL);
> > +   value = (value + (value >> 4)) & 0x0F0F0F0F0F0F0F0FULL;
> > +
> > +   return (int)((unsigned long long)(value * 0x0101010101010101ULL) >> 
> > 56);
> > +}
>
> The kernel has a portable popcnt implementation called hweight64 if
> you #include ; does that work here?
> https://en.wikipedia.org/wiki/Hamming_weight
> --
> Thanks,
> ~Nick Desaulniers
>
> --
> You received this message because you are subscribed to the Google Groups 
> "Clang Built Linux" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to clang-built-linux+unsubscr...@googlegroups.com.
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/clang-built-linux/CAKwvOdk%2BNqhzC_4wFbQMJmLMQWoDSjQiRJyCGe5dsWkqK_NJJQ%40mail.gmail.com.


Re: [PATCH] pgo: add clang's Profile Guided Optimization infrastructure

2021-01-11 Thread Fāng-ruì Sòng
On Mon, Jan 11, 2021 at 4:38 PM Bill Wendling  wrote:
>
> On Mon, Jan 11, 2021 at 12:31 PM Fangrui Song  wrote:
> > On 2021-01-11, Bill Wendling wrote:
> > >On Mon, Jan 11, 2021 at 12:12 PM Fangrui Song  wrote:
> > >>
> > >> On 2021-01-11, 'Bill Wendling' via Clang Built Linux wrote:
> > >> >From: Sami Tolvanen 
> > >> >
> > >> >Enable the use of clang's Profile-Guided Optimization[1]. To generate a
> > >> >profile, the kernel is instrumented with PGO counters, a representative
> > >> >workload is run, and the raw profile data is collected from
> > >> >/sys/kernel/debug/pgo/profraw.
> > >> >
> > >> >The raw profile data must be processed by clang's "llvm-profdata" tool 
> > >> >before
> > >> >it can be used during recompilation:
> > >> >
> > >> >  $ cp /sys/kernel/debug/pgo/profraw vmlinux.profraw
> > >> >  $ llvm-profdata merge --output=vmlinux.profdata vmlinux.profraw
> > >> >
> > >> >Multiple raw profiles may be merged during this step.
> > >> >
> > >> >The data can be used either by the compiler if LTO isn't enabled:
> > >> >
> > >> >... -fprofile-use=vmlinux.profdata ...
> > >> >
> > >> >or by LLD if LTO is enabled:
> > >> >
> > >> >... -lto-cs-profile-file=vmlinux.profdata ...
> > >>
> > >> This LLD option does not exist.
> > >> LLD does have some `--lto-*` options but the `-lto-*` form is not 
> > >> supported
> > >> (it clashes with -l) https://reviews.llvm.org/D79371
> > >>
> > >That's strange. I've been using that option for years now. :-) Is this
> > >a recent change?
> >
> > The more frequently used options (specifyed by the clang driver) are
> > -plugin-opt=... (options implemented by LLVMgold.so).
> > `-lto-*` is rare.
> >
> > >> (There is an earlier -fprofile-instr-generate which does
> > >> instrumentation in Clang, but the option does not have broad usage.
> > >> It is used more for code coverage, not for optimization.
> > >> Noticeably, it does not even implement the Kirchhoff's current law
> > >> optimization)
> > >>
> > >Right. I've been told outside of this email that -fprofile-generate is
> > >the prefered flag to use.
> > >
> > >> -fprofile-use= is used by both regular PGO and context-sensitive PGO 
> > >> (CSPGO).
> > >>
> > >> clang -flto=thin -fprofile-use= passes -plugin-opt=cs-profile-path= to 
> > >> the linker.
> > >> For regular PGO, this option is effectively a no-op (confirmed with 
> > >> CSPGO main developer).
> > >>
> > >> So I think the "or by LLD if LTO is enabled:" part should be removed.
> > >
> > >But what if you specify the linking step explicitly? Linux doesn't
> > >call "clang" when linking, but "ld.lld".
> >
> > Regular PGO+LTO does not need -plugin-opt=cs-profile-path=
> > CSPGO+LTO needs it.
> > Because -fprofile-use= may be used by both, Clang driver adds it.
> > CSPGO is relevant in this this patch, so the linker option does not need to 
> > be mentioned.
>
> I'm still a bit confused. Are you saying that when clang uses
> `-flto=thin -fprofile-use=foo` that the profile file "foo" is embedded
> into the bitcode file so that when the linker's run it'll be used?
>
> This is the workflow:
>
> clang ... -fprofile-use=vmlinux.profdata ... -c -o foo.o foo.c
> clang ... -fprofile-use=vmlinux.profdata ... -c -o bar.o bar.c
> ld.lld ...  foo.o bar.o
>
> Are you saying that we don't need to have
> "-plugin-opt=cs-profile-path=vmlinux.profdata" on the "ld.lld ..."
> line?
>
> -bw

The backend compile step -flto=thin -fprofile-use=foo has all the information.

-plugin-opt=cs-profile-path=vmlinux.profdata is not needed for regular PGO.


Re: [PATCH v3] x86/entry: emit a symbol for register restoring thunk

2021-01-11 Thread Fāng-ruì Sòng
On Mon, Jan 11, 2021 at 4:38 PM Borislav Petkov  wrote:
>
> On Mon, Jan 11, 2021 at 12:38:06PM -0800, Nick Desaulniers wrote:
> > Arnd found a randconfig that produces the warning:
> >
> > arch/x86/entry/thunk_64.o: warning: objtool: missing symbol for insn at
> > offset 0x3e
> >
> > when building with LLVM_IAS=1 (use Clang's integrated assembler). Josh
> > notes:
> >
> >   With the LLVM assembler stripping the .text section symbol, objtool
> >   has no way to reference this code when it generates ORC unwinder
> >   entries, because this code is outside of any ELF function.
> >
> > Fangrui notes that this optimization is helpful for reducing images size
> > when compiling with -ffunction-sections and -fdata-sections. I have
> > observerd on the order of tens of thousands of symbols for the kernel
> > images built with those flags. A patch has been authored against GNU
> > binutils to match this behavior, with a new flag
> > --generate-unused-section-symbols=[yes|no].
> >
> > We can omit the .L prefix on a label to emit an entry into the symbol
> > table for the label, with STB_LOCAL binding.  This enables objtool to
> > generate proper unwind info here with LLVM_IAS=1.
> >
> > Cc: Fangrui Song 
> > Link: https://github.com/ClangBuiltLinux/linux/issues/1209
> > Link: https://reviews.llvm.org/D93783
> > Link: https://sourceware.org/binutils/docs/as/Symbol-Names.html
> > Link: https://sourceware.org/pipermail/binutils/2020-December/114671.html
> > Reported-by: Arnd Bergmann 
> > Suggested-by: Josh Poimboeuf 
> > Signed-off-by: Nick Desaulniers 
> > ---
> > Changes v2 -> v3:
> > * rework to use STB_LOCAL rather than STB_GLOBAL by dropping .L prefix,
> >   as per Josh.
>
> Ok so I read a bit around those links above...
>
> Are you trying to tell me here that we can't use .L-prefixed local
> labels anymore because, well, clang's assembler is way too overzealous
> when stripping symbols to save whopping KiBs of memory?!

To be fair: we cannot use .L-prefixed local because of the objtool limitation.
The LLVM integrated assembler behavior is a good one and binutils
global maintainers have agreed so H.J. went ahead and implemented it
for GNU as x86.

> Btw Josh made sense to me when asking for a flag or so to keep .text.
>
> And I see --generate-unused-section-symbols=[yes|no] for binutils.
>
> So why isn't there a patch using that switch on clang too instead of the
> kernel having to dance yet again for some tool?

--generate-unused-section-symbols=[yes|no] as an assembler option has
been rejected.


> :-\
>
> --
> Regards/Gruss,
> Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette


Re: [PATCH v3] x86/entry: emit a symbol for register restoring thunk

2021-01-11 Thread Fāng-ruì Sòng
On Mon, Jan 11, 2021 at 2:09 PM Josh Poimboeuf  wrote:
>
> On Mon, Jan 11, 2021 at 12:58:14PM -0800, Fangrui Song wrote:
> > On 2021-01-11, Nick Desaulniers wrote:
> > > Arnd found a randconfig that produces the warning:
> > >
> > > arch/x86/entry/thunk_64.o: warning: objtool: missing symbol for insn at
> > > offset 0x3e
> > >
> > > when building with LLVM_IAS=1 (use Clang's integrated assembler). Josh
> > > notes:
> > >
> > >  With the LLVM assembler stripping the .text section symbol, objtool
> > >  has no way to reference this code when it generates ORC unwinder
> > >  entries, because this code is outside of any ELF function.
> > >
> > > Fangrui notes that this optimization is helpful for reducing images size
> > > when compiling with -ffunction-sections and -fdata-sections. I have
> > > observerd on the order of tens of thousands of symbols for the kernel
> > > images built with those flags. A patch has been authored against GNU
> > > binutils to match this behavior, with a new flag
> > > --generate-unused-section-symbols=[yes|no].
> >
> > https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=d1bcae833b32f1408485ce69f844dcd7ded093a8
> > has been committed. The patch should be included in binutils 2.37.
> > The maintainers are welcome to the idea, but fixing all the arch-specific 
> > tests is tricky.
> >
> > H.J. fixed the x86 tests and enabled this for x86. When binutils 2.37
> > come out, some other architectures may follow as well.
> >
> > > We can omit the .L prefix on a label to emit an entry into the symbol
> > > table for the label, with STB_LOCAL binding.  This enables objtool to
> > > generate proper unwind info here with LLVM_IAS=1.
> >
> > Josh, I think objtool orc generate needs to synthesize STT_SECTION
> > symbols even if they do not exist in object files.
>
> I'm guessing you don't mean re-adding *all* missing STT_SECTIONs, as
> that would just be undoing these new assembler features.
>
> We could re-add STT_SECTION only when there's no other corresponding
> symbol associated with the code, but then objtool would have to start
> updating the symbol table (which right now it manages to completely
> avoid).  But that would only be for the niche cases, like
> 'SYM_CODE.*\.L' as you mentioned.
>
> I'd rather avoid making doing something so pervasive for such a small
> number of edge cases.  It's hopefully easier and more robust to just say
> "all code must be associated with a symbol".  I suspect we're already
> ~99.99% there anyway.
>
>   $ git grep -e 'SYM_CODE.*\.L'
>   arch/x86/entry/entry_64.S:SYM_CODE_START_LOCAL_NOALIGN(.Lbad_gs)
>   arch/x86/entry/entry_64.S:SYM_CODE_END(.Lbad_gs)
>   arch/x86/entry/thunk_64.S:SYM_CODE_START_LOCAL_NOALIGN(.L_restore)
>   arch/x86/entry/thunk_64.S:SYM_CODE_END(.L_restore)
>   arch/x86/lib/copy_user_64.S:SYM_CODE_START_LOCAL(.Lcopy_user_handle_tail)
>   arch/x86/lib/copy_user_64.S:SYM_CODE_END(.Lcopy_user_handle_tail)
>   arch/x86/lib/getuser.S:SYM_CODE_START_LOCAL(.Lbad_get_user_clac)
>   arch/x86/lib/getuser.S:SYM_CODE_END(.Lbad_get_user_clac)
>   arch/x86/lib/getuser.S:SYM_CODE_START_LOCAL(.Lbad_get_user_8_clac)
>   arch/x86/lib/getuser.S:SYM_CODE_END(.Lbad_get_user_8_clac)
>   arch/x86/lib/putuser.S:SYM_CODE_START_LOCAL(.Lbad_put_user_clac)
>   arch/x86/lib/putuser.S:SYM_CODE_END(.Lbad_put_user_clac)

I'd prefer that the assembly can continue using .L and does not know
the objtool limitation.
Assemblers normally drop .L symbols. These symbols are otherwise not useful.

However, if as you said, teaching objtool about synthesizing
STT_SECTION from section header table is difficult,
this patch looks fine to me.

Reviewed-by: Fangrui Song 

> Alternatively, the assemblers could add an option to only strip
> -ffunction-sections and -fdata-sections STT_SECTION symbols, e.g. leave
> ".text" and friends alone.

I forgot to mention that --generate-unused-section-symbols=[yes|no] is
not added to GNU as.
Making the assembler behavior dependent on -ffunction-sections is not
an option in both LLVM integrated assembler and GNU as.


Re: [PATCH v2 mips-next 2/4] MIPS: vmlinux.lds.S: add ".gnu.attributes" to DISCARDS

2021-01-06 Thread Fāng-ruì Sòng
On Wed, Jan 6, 2021 at 2:07 PM Kees Cook  wrote:
>
> On Wed, Jan 06, 2021 at 08:08:19PM +, Alexander Lobakin wrote:
> > Discard GNU attributes at link time as kernel doesn't use it at all.
> > Solves a dozen of the following ld warnings (one per every file):
> >
> > mips-alpine-linux-musl-ld: warning: orphan section `.gnu.attributes'
> > from `arch/mips/kernel/head.o' being placed in section
> > `.gnu.attributes'
> > mips-alpine-linux-musl-ld: warning: orphan section `.gnu.attributes'
> > from `init/main.o' being placed in section `.gnu.attributes'
> >
> > Misc: sort DISCARDS section entries alphabetically.
>
> Hmm, I wonder what is causing the appearance of .eh_frame? With help I
> tracked down all the causes of this on x86, arm, and arm64, so that's
> why it's not in the asm-generic DISCARDS section. I suspect this could
> be cleaned up for mips too?

On x86, 003602ad5516e59940de42e44c8d8033387bb363 "x86/*/Makefile: Use
-fno-asynchronous-unwind-tables to suppress .eh_frame sections"
noticed that some Makefiles
redefined KBUILD_CFLAGS and dropped -fno-asynchronous-unwind-tables.
Maybe mips has similar issues.

> Similarly for .gnu.attributes. What is generating that? (Or, more
> specifically, why is it both being generated AND discarded?)
>
> -Kees

gcc/config/mips/mips.c
https://github.com/gcc-mirror/gcc/blob/master/gcc/config/mips/mips.c#L9965
.gnu_attribute 4, 0 does not produce .gnu.attributes
(SHT_GNU_ATTRIBUTES) but there are likely code paths that
a non-zero value is used... So .gnu_attributes is likely needed to be excluded.

> >
> > Signed-off-by: Alexander Lobakin 
> > ---
> >  arch/mips/kernel/vmlinux.lds.S | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/mips/kernel/vmlinux.lds.S b/arch/mips/kernel/vmlinux.lds.S
> > index 83e27a181206..5d6563970ab2 100644
> > --- a/arch/mips/kernel/vmlinux.lds.S
> > +++ b/arch/mips/kernel/vmlinux.lds.S
> > @@ -221,9 +221,10 @@ SECTIONS
> >   /* ABI crap starts here */
> >   *(.MIPS.abiflags)
> >   *(.MIPS.options)
> > + *(.eh_frame)
> > + *(.gnu.attributes)
> >   *(.options)
> >   *(.pdr)
> >   *(.reginfo)
> > - *(.eh_frame)
> >   }
> >  }
> > --
> > 2.30.0
> >
> >
>
> --
> Kees Cook



-- 
宋方睿


Re: [PATCH v2 mips-next 3/4] MIPS: vmlinux.lds.S: catch bad .got, .plt and .rel.dyn at link time

2021-01-06 Thread Fāng-ruì Sòng
On Wed, Jan 6, 2021 at 12:08 PM Alexander Lobakin  wrote:
>
> Catch any symbols placed in .got, .got.plt, .plt, .rel.dyn
> or .rela.dyn and check for these sections to be zero-sized
> at link time.
>
> At least two of them were noticed in real builds:
>
> mips-alpine-linux-musl-ld: warning: orphan section `.rel.dyn'
> from `init/main.o' being placed in section `.rel.dyn'
>
> ld.lld: warning: :(.got) is being placed in '.got'
>
> Adopted from x86/kernel/vmlinux.lds.S.
>
> Reported-by: Nathan Chancellor  # .got
> Suggested-by: Fangrui Song  # .rel.dyn
> Signed-off-by: Alexander Lobakin 
> ---
>  arch/mips/kernel/vmlinux.lds.S | 35 ++
>  1 file changed, 35 insertions(+)
>
> diff --git a/arch/mips/kernel/vmlinux.lds.S b/arch/mips/kernel/vmlinux.lds.S
> index 5d6563970ab2..05eda9d9a7d5 100644
> --- a/arch/mips/kernel/vmlinux.lds.S
> +++ b/arch/mips/kernel/vmlinux.lds.S
> @@ -227,4 +227,39 @@ SECTIONS
> *(.pdr)
> *(.reginfo)
> }
> +
> +   /*
> +* Sections that should stay zero sized, which is safer to
> +* explicitly check instead of blindly discarding.
> +*/
> +
> +   .got : {
> +   *(.got)
> +   *(.igot.*)
> +   }
> +   ASSERT(SIZEOF(.got) == 0, "Unexpected GOT entries detected!")
> +
> +   .got.plt (INFO) : {
> +   *(.got.plt)
> +   }
> +   ASSERT(SIZEOF(.got.plt) == 0, "Unexpected GOT/PLT entries detected!")

(INFO) drops the SHF_ALLOC flag from the output section (It does not
mean "informational"). INFO is not need here.
The diff from 815d680771ae09080d2da83dac2647c08cdf99ce "x86/build:
Enforce an empty .got.plt section" is not needed.

> +   .plt : {
> +   *(.plt)
> +   *(.plt.*)
> +   *(.iplt)
> +   }
> +   ASSERT(SIZEOF(.plt) == 0, "Unexpected run-time procedure linkages 
> detected!")
> +
> +   .rel.dyn : {
> +   *(.rel.*)
> +   *(.rel_*)
> +   }
> +   ASSERT(SIZEOF(.rel.dyn) == 0, "Unexpected run-time relocations (.rel) 
> detected!")
> +
> +   .rela.dyn : {
> +   *(.rela.*)
> +   *(.rela_*)
> +   }
> +   ASSERT(SIZEOF(.rela.dyn) == 0, "Unexpected run-time relocations 
> (.rela) detected!")
>  }

x86 has both .rel.dyn and .rela.dyn because i386 psABI uses REL while
x86-64 psABI uses RELA, but mips does not need .rela.dyn

> --
> 2.30.0
>
>


Re: [PATCH mips-next 2/4] MIPS: vmlinux.lds.S: add ".rel.dyn" to DISCARDS

2021-01-04 Thread Fāng-ruì Sòng
On Mon, Jan 4, 2021 at 4:21 AM Alexander Lobakin  wrote:
>
> GCC somehow manages to place some of the symbols from main.c into
> .rel.dyn section:
>
> mips-alpine-linux-musl-ld: warning: orphan section `.rel.dyn'
> from `init/main.o' being placed in section `.rel.dyn'
>
> I couldn't catch up the exact symbol, but seems like it's harmless
> to discard it from the final vmlinux as kernel doesn't use or
> support dynamic relocations.
>
> Misc: sort DISCARDS section entries alphabetically.
>
> Signed-off-by: Alexander Lobakin 
> ---
>  arch/mips/kernel/vmlinux.lds.S | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/arch/mips/kernel/vmlinux.lds.S b/arch/mips/kernel/vmlinux.lds.S
> index 83e27a181206..1c3c2e903062 100644
> --- a/arch/mips/kernel/vmlinux.lds.S
> +++ b/arch/mips/kernel/vmlinux.lds.S
> @@ -221,9 +221,10 @@ SECTIONS
> /* ABI crap starts here */
> *(.MIPS.abiflags)
> *(.MIPS.options)
> +   *(.eh_frame)
> *(.options)
> *(.pdr)
> *(.reginfo)
> -   *(.eh_frame)
> +   *(.rel.dyn)
> }
>  }
> --
> 2.30.0
>
>

(I don't know why I am on the CC list since I know little about
mips... Anyway, I know the LLD linker's behavior in case that was the
intention... )

I think it'd be good to know the reason why these dynamic relocations
are produced and fix the root cause.

arch/x86/kernel/vmlinux.lds.S asserts no dynamic relocation:
ASSERT(SIZEOF(.rela.dyn) == 0, "Unexpected run-time relocations
(.rela) detected!")


Re: [PATCH v8 00/16] Add support for Clang LTO

2020-12-08 Thread Fāng-ruì Sòng
On Tue, Dec 8, 2020 at 1:02 PM Arnd Bergmann  wrote:
>
> On Tue, Dec 8, 2020 at 9:59 PM Arnd Bergmann  wrote:
> >
> > Attaching the config for "ld.lld: error: Never resolved function from
> >   blockaddress (Producer: 'LLVM12.0.0' Reader: 'LLVM 12.0.0')"
>
> And here is a new one: "ld.lld: error: assignment to symbol
> init_pg_end does not converge"
>
>   Arnd
>

This is interesting. I changed the symbol assignment to a separate
loop in https://reviews.llvm.org/D66279
Does raising the limit help? Sometimes the kernel linker script can be
rewritten to be more friendly to the linker...


Re: [PATCH v2 3/4] Kbuild: make DWARF version a choice

2020-12-01 Thread Fāng-ruì Sòng

On 2020-12-01, Segher Boessenkool wrote:

On Tue, Dec 01, 2020 at 12:38:16PM +0900, Masahiro Yamada wrote:

> We can bump -Wa,-gdwarf-2 to -Wa,-gdwarf-3 since GNU actually emits
> DWARF v3 DW_AT_ranges (see
> https://sourceware.org/bugzilla/show_bug.cgi?id=26850 )
> This can avoid the `warning: DWARF2 only supports one section per
> compilation unit` warning for Clang.


That warning should be "there can be only one section with executable
code per translation unit", or similar.


I am not a DWARF spec expert.


Neither am I.


Please teach me.

In my understanding, "DWARF2 only supports one section ..."
is warned only when building .S files with LLVM_IAS=1


.S files are simply run through the C preprocessor first, and then given
to the assembler.  The only difference there should be wrt debug info is
you could have some macros that expand to assembler debug statements.


If this is due to the limitation of DWARF v2, why is it OK to
build .c files with LLVM_IAS?


The compiler can of course make sure not to use certain constructs in
its generated assembler code, while the assembler will have to swallow
whatever the user wrote.



These are all correct. You can use `llvm-dwarfdump a.o` to dump a .o file.
It has one DW_TAG_compile_unit. If the translation unit has a single
contiguous address range, the assembler can emit a pair of
DW_AT_low_pc/DW_AT_high_pc (available in DWARF v2). In the case of
multiple executable sections, it is not guaranteed that in the final
linked image the sections will be contiguous, so the assembler has to
assume there may be non-contiguous address ranges and use DW_AT_ranges.

Unfortunately DW_AT_ranges was introduced in DWARF v3 and technically
not available in DWARF v2. But GNU as ignores this and emits
DW_AT_ranges anyway (this is probably fine - like using a GNU extension).

If -Wa,-gdwarf-2 -> -Wa,-gdwarf-3 can eliminate the LLVM integrated
assembler's warning, we should do it. If people think -Wa,-gdwarf-2 is
not useful and want to delete it, I'll be happier. Whether it is
necessary to use -Wa,-gdwarf-2/-Wa,-gdwarf-5? Personally I would think
this is unnecessary, but I won't mind if people don't mind the
additional complexity in Makefile. (I implemented the -gdwarf-5 address
range stuff for the integrated assembler).


Re: [PATCH v2 3/4] Kbuild: make DWARF version a choice

2020-11-30 Thread Fāng-ruì Sòng
On Mon, Nov 30, 2020 at 10:05 AM Masahiro Yamada  wrote:
>
> On Wed, Nov 4, 2020 at 9:53 AM 'Nick Desaulniers' via Clang Built
> Linux  wrote:
> >
> > Modifies CONFIG_DEBUG_INFO_DWARF4 to be a member of a choice. Adds an
> > explicit CONFIG_DEBUG_INFO_DWARF2, which is the default. Does so in a
> > way that's forward compatible with existing configs, and makes adding
> > future versions more straightforward.
> >
> > Suggested-by: Fangrui Song 
> > Suggested-by: Masahiro Yamada 
> > Signed-off-by: Nick Desaulniers 
> > ---
> >  Makefile  | 14 --
> >  lib/Kconfig.debug | 19 +++
> >  2 files changed, 23 insertions(+), 10 deletions(-)
> >
> > diff --git a/Makefile b/Makefile
> > index 75b1a3dcbf30..e23786a4c1c7 100644
> > --- a/Makefile
> > +++ b/Makefile
> > @@ -826,12 +826,14 @@ else
> >  DEBUG_CFLAGS   += -g
> >  endif
> >
> > -ifndef LLVM_IAS
> > -KBUILD_AFLAGS  += -Wa,-gdwarf-2
> > -endif
> > -
> > -ifdef CONFIG_DEBUG_INFO_DWARF4
> > -DEBUG_CFLAGS   += -gdwarf-4
> > +dwarf-version-$(CONFIG_DEBUG_INFO_DWARF2) := 2
> > +dwarf-version-$(CONFIG_DEBUG_INFO_DWARF4) := 4
> > +DEBUG_CFLAGS   += -gdwarf-$(dwarf-version-y)
> > +ifneq ($(dwarf-version-y)$(LLVM_IAS),21)
> > +# Binutils 2.35+ required for -gdwarf-4+ support.
> > +dwarf-aflag:= $(call as-option,-Wa$(comma)-gdwarf-$(dwarf-version-y))
> > +DEBUG_CFLAGS   += $(dwarf-aflag)
>
> This changes the behavior.
>
> For the Dwarf-2 case,
>
> Previously, -gdwarf-2 was passed to $(CC),
> so the debug info was generated by gcc.
>
> Now, -Wa,-gdwarf-2 is passed to $(CC).
> -gdwarf-2 is handled by GNU as.
> So, the source info points to /tmp/.s
> instead of the original .c file.
>
>
>
> Handling the Dwarf capability is very complicated.
>
> Are you still working for v3?
>
>
>
> > +KBUILD_AFLAGS  += $(dwarf-aflag)
> >  endif
> >
> >  ifdef CONFIG_DEBUG_INFO_REDUCED
> > diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> > index 83a860126897..03c494eefabd 100644
> > --- a/lib/Kconfig.debug
> > +++ b/lib/Kconfig.debug
> > @@ -256,14 +256,25 @@ config DEBUG_INFO_SPLIT
> >   to know about the .dwo files and include them.
> >   Incompatible with older versions of ccache.
> >
> > +choice
> > +   prompt "DWARF version"
> > +   help
> > + Which version of DWARF debug info to emit.
> > +
> > +config DEBUG_INFO_DWARF2
> > +   bool "Generate DWARF v2 debuginfo"
> > +   help
> > + Generate DWARF v2 debug info.
> > +
> >  config DEBUG_INFO_DWARF4
> > bool "Generate dwarf4 debuginfo"
> > depends on $(cc-option,-gdwarf-4)
> > help
> > - Generate dwarf4 debug info. This requires recent versions
> > - of gcc and gdb. It makes the debug information larger.
> > - But it significantly improves the success of resolving
> > - variables in gdb on optimized code.
> > + Generate DWARF v4 debug info. This requires gcc 4.5+ and gdb 7.0+.
> > + It makes the debug information larger, but it significantly
> > + improves the success of resolving variables in gdb on optimized 
> > code.
> > +
> > +endchoice # "DWARF version"
> >
> >  config DEBUG_INFO_BTF
> > bool "Generate BTF typeinfo"
> > --
> > 2.29.1.341.ge80a0c044ae-goog


We can bump -Wa,-gdwarf-2 to -Wa,-gdwarf-3 since GNU actually emits
DWARF v3 DW_AT_ranges (see
https://sourceware.org/bugzilla/show_bug.cgi?id=26850 )
This can avoid the `warning: DWARF2 only supports one section per
compilation unit` warning for Clang.

Deleting -Wa,-gdwarf-2 also sounds good to me if people can verify
their debugging experience is not regressed (I believe it is useless).


-- 
宋方睿


Re: [PATCH v15 01/26] Documentation/x86: Add CET description

2020-11-30 Thread Fāng-ruì Sòng
On Mon, Nov 30, 2020 at 10:34 AM Yu, Yu-cheng  wrote:
>
> On 11/30/2020 10:26 AM, Nick Desaulniers wrote:
> > (In response to 
> > https://lore.kernel.org/lkml/20201110162211.9207-2-yu-cheng...@intel.com/)
> >
> >> These need to be enabled to build a CET-enabled kernel, and Binutils v2.31
> >> and GCC v8.1 or later are required to build a CET kernel.
> >
> > What about LLVM? Surely CrOS might be of interest to ship this on (we ship 
> > the
> > equivalent for aarch64 on Android).
> >
>
> I have not built with LLVM, but think it probably will work as well.  I
> will test it.
>
> >> An application's CET capability is marked in its ELF header and can be
> >> verified from the following command output, in the NT_GNU_PROPERTY_TYPE_0
> >> field:
> >>
> >>  readelf -n  | grep SHSTK
> >>  properties: x86 feature: IBT, SHSTK
> >
> > Same for llvm-readelf.
> >
>
> I will add that to the document.
>
> Thanks,
> Yu-cheng

The baseline LLVM version is 10.0.1, which is good enough for  clang
-fcf-protection=full, llvm-readelf -n, LLD's .note.gnu.property
handling (the LLD option is `-z force-ibt`, though)


-- 
宋方睿


Re: Error: invalid switch -me200

2020-11-13 Thread Fāng-ruì Sòng
On Fri, Nov 13, 2020 at 4:23 PM Segher Boessenkool
 wrote:
>
> On Fri, Nov 13, 2020 at 12:14:18PM -0800, Nick Desaulniers wrote:
> > > > > Error: invalid switch -me200
> > > > > Error: unrecognized option -me200
> > > >
> > > > 251 cpu-as-$(CONFIG_E200)   += -Wa,-me200
> > > >
> > > > Are those all broken configs, or is Kconfig messed up such that
> > > > randconfig can select these when it should not?
> > >
> > > Hmmm, looks like this flag does not exist in mainline binutils? There is
> > > a thread in 2010 about this that Segher commented on:
> > >
> > > https://lore.kernel.org/linuxppc-dev/9859e645-954d-4d07-8003-ffcd2391a...@kernel.crashing.org/
> > >
> > > Guess this config should be eliminated?
>
> The help text for this config options says that e200 is used in 55xx,
> and there *is* an -me5500 GAS flag (which probably does this same
> thing, too).  But is any of this tested, or useful, or wanted?
>
> Maybe Christophe knows, cc:ed.
>
>
> Segher

CC Alan Modra, a binutils global maintainer.

Alan, can the few -Wa,-m* options deleted from arch/powerpc/Makefile ?
The topic started at
http://lore.kernel.org/r/202011131146.g8dplqdd-...@intel.com and
people would like to get rid of some options (if possible).


Re: Error: invalid switch -me200

2020-11-13 Thread Fāng-ruì Sòng
On Thu, Nov 12, 2020 at 7:22 PM kernel test robot  wrote:
>
> Hi Fangrui,
>
> FYI, the error/warning still remains.
>
> tree:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 
> master
> head:   585e5b17b92dead8a3aca4e3c9876fbca5f7e0ba
> commit: ca9b31f6bb9c6aa9b4e5f0792f39a97bbffb8c51 Makefile: Fix 
> GCC_TOOLCHAIN_DIR prefix for Clang cross compilation
> date:   4 months ago
> config: powerpc-randconfig-r031-20201113 (attached as .config)
> compiler: clang version 12.0.0 (https://github.com/llvm/llvm-project 
> 9e0c35655b6e8186baef8840b26ba4090503b554)
> reproduce (this is a W=1 build):
> wget 
> https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
> ~/bin/make.cross
> chmod +x ~/bin/make.cross
> # install powerpc cross compiling tool for clang build
> # apt-get install binutils-powerpc-linux-gnu
> # 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ca9b31f6bb9c6aa9b4e5f0792f39a97bbffb8c51
> git remote add linus 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
> git fetch --no-tags linus master
> git checkout ca9b31f6bb9c6aa9b4e5f0792f39a97bbffb8c51
> # save the attached .config to linux build tree
> COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross 
> ARCH=powerpc
>
> If you fix the issue, kindly add following tag as appropriate
> Reported-by: kernel test robot 
>
> All errors (new ones prefixed by >>):
>
>Assembler messages:
> >> Error: invalid switch -me200
> >> Error: unrecognized option -me200
>clang-12: error: assembler command failed with exit code 1 (use -v to see 
> invocation)
>make[2]: *** [scripts/Makefile.build:281: scripts/mod/empty.o] Error 1
>make[2]: Target '__build' not remade because of errors.
>make[1]: *** [Makefile:1174: prepare0] Error 2
>make[1]: Target 'prepare' not remade because of errors.
>make: *** [Makefile:185: __sub-make] Error 2
>make: Target 'prepare' not remade because of errors.
>
> ---
> 0-DAY CI Kernel Test Service, Intel Corporation
> https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org

This can be ignored. The LLVM integrated assembler does not recognize
-me200 (-Wa,-me200 in arch/powerpc/Makefile). I guess the GNU as -m
option is similar to .arch or .machine and controls what instructions
are recognized. The integrated assembler tends to support all
instructions (conditional supporting some instructions has some
challenges; in the end I have patched parsing but ignoring `.arch` for
x86-64 and ignoring `.machine ppc64` for ppc64)

(In addition, e200 is a 32-bit Power ISA microprocessor. 32-bit
support may get less attention in LLVM.)


Re: [PATCH v2] vmlinux.lds.h: Keep .ctors.* with .ctors

2020-10-21 Thread Fāng-ruì Sòng
On Wed, Oct 21, 2020 at 1:09 PM Kees Cook  wrote:
>
> On Wed, Oct 14, 2020 at 09:53:39PM -0700, Fāng-ruì Sòng wrote:
> > On Wed, Oct 14, 2020 at 4:04 PM Kees Cook  wrote:
> > > > index 5430febd34be..b83c00c63997 100644
> > > > --- a/include/asm-generic/vmlinux.lds.h
> > > > +++ b/include/asm-generic/vmlinux.lds.h
> > > > @@ -684,6 +684,7 @@
> > > >  #ifdef CONFIG_CONSTRUCTORS
> > > >  #define KERNEL_CTORS()   . = ALIGN(8);  \
> > > >   __ctors_start = .; \
> > > > + KEEP(*(SORT(.ctors.*)))\
> > > >   KEEP(*(.ctors))\
> > > >   KEEP(*(SORT(.init_array.*)))   \
> > > >   KEEP(*(.init_array))   \
> > > > --
> > > > 2.25.1
> >
> > I think it would be great to figure out why these .ctors.* .dtors.* are 
> > generated.
>
> I haven't had the time to investigate. This patch keeps sfr's builds
> from regressing, so we need at least this first.

We need to know under what circumstances .ctors.* are generated.
For Clang>=10.0.1, for all *-linux triples, .init_array/.finit_array
are used by default.
There is a toggle -fno-use-init-array (not in GCC) to switch back to
.ctors/.dtors

Modern GCC also uses .init_array. The minimum requirement is now GCC
4.9 and thus I wonder whether the .ctors configuration is still
supported.
If it is (maybe because glibc version which is not specified on
https://www.kernel.org/doc/html/latest/process/changes.html ), we
should use
some #if to highlight that.

> > ~GCC 4.7 switched to default to .init_array/.fini_array if libc
> > supports it. I have some refactoring in this area of Clang as well
> > (e.g. https://reviews.llvm.org/D71393)
> >
> > And I am not sure SORT(.init_array.*) or SORT(.ctors.*) will work. The
> > correct construct is SORT_BY_INIT_PRIORITY(.init_array.*)
>
> The kernel doesn't seem to use the init_priority attribute at all. Are
> you saying the cause of the .ctors.* names are a result of some internal
> use of init_priority by the compiler here?
>

If no priority is intended, consider deleting SORT to avoid confusion?


Re: [PATCH v2] vmlinux.lds.h: Keep .ctors.* with .ctors

2020-10-14 Thread Fāng-ruì Sòng
On Wed, Oct 14, 2020 at 4:04 PM Kees Cook  wrote:
>
> On Sun, Oct 04, 2020 at 07:57:20PM -0700, Kees Cook wrote:
> > Under some circumstances, the compiler generates .ctors.* sections. This
> > is seen doing a cross compile of x86_64 from a powerpc64el host:
> >
> > x86_64-linux-gnu-ld: warning: orphan section `.ctors.65435' from 
> > `kernel/trace/trace_clock.o' being
> > placed in section `.ctors.65435'
> > x86_64-linux-gnu-ld: warning: orphan section `.ctors.65435' from 
> > `kernel/trace/ftrace.o' being
> > placed in section `.ctors.65435'
> > x86_64-linux-gnu-ld: warning: orphan section `.ctors.65435' from 
> > `kernel/trace/ring_buffer.o' being
> > placed in section `.ctors.65435'
> >
> > Include these orphans along with the regular .ctors section.
> >
> > Reported-by: Stephen Rothwell 
> > Tested-by: Stephen Rothwell 
> > Fixes: 83109d5d5fba ("x86/build: Warn on orphan section placement")
> > Signed-off-by: Kees Cook 
>
> Ping -- please take this for tip/urgent, otherwise we're drowning sfr in
> warnings. :)
>
> -Kees
>
> > ---
> > v2: brown paper bag version: fix whitespace for proper backslash alignment
> > ---
> >  include/asm-generic/vmlinux.lds.h | 1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/include/asm-generic/vmlinux.lds.h 
> > b/include/asm-generic/vmlinux.lds.h
> > index 5430febd34be..b83c00c63997 100644
> > --- a/include/asm-generic/vmlinux.lds.h
> > +++ b/include/asm-generic/vmlinux.lds.h
> > @@ -684,6 +684,7 @@
> >  #ifdef CONFIG_CONSTRUCTORS
> >  #define KERNEL_CTORS()   . = ALIGN(8);  \
> >   __ctors_start = .; \
> > + KEEP(*(SORT(.ctors.*)))\
> >   KEEP(*(.ctors))\
> >   KEEP(*(SORT(.init_array.*)))   \
> >   KEEP(*(.init_array))   \
> > --
> > 2.25.1
> >
>
> --
> Kees Cook
>
> --
> You received this message because you are subscribed to the Google Groups 
> "Clang Built Linux" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to clang-built-linux+unsubscr...@googlegroups.com.
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/clang-built-linux/202010141603.49EA0CE%40keescook.

I think it would be great to figure out why these .ctors.* .dtors.*
are generated.
~GCC 4.7 switched to default to .init_array/.fini_array if libc
supports it. I have some refactoring in this area of Clang as well
(e.g. https://reviews.llvm.org/D71393)

And I am not sure SORT(.init_array.*) or SORT(.ctors.*) will work. The
correct construct is SORT_BY_INIT_PRIORITY(.init_array.*)


Re: [PATCH] arm/build: Always handle .ARM.exidx and .ARM.extab sections

2020-10-12 Thread Fāng-ruì Sòng
On Mon, Oct 12, 2020 at 2:11 PM 'Nick Desaulniers' via Clang Built
Linux  wrote:
>
> On Mon, Sep 28, 2020 at 3:49 PM Nathan Chancellor
>  wrote:
> >
> > After turning on warnings for orphan section placement, enabling
> > CONFIG_UNWINDER_FRAME_POINTER instead of CONFIG_UNWINDER_ARM causes
> > thousands of warnings when clang + ld.lld are used:
> >
> > $ scripts/config --file arch/arm/configs/multi_v7_defconfig \
> >  -d CONFIG_UNWINDER_ARM \
> >  -e CONFIG_UNWINDER_FRAME_POINTER
> > $ make -skj"$(nproc)" ARCH=arm CROSS_COMPILE=arm-linux-gnueabi- LLVM=1 
> > defconfig zImage
> > ld.lld: warning: init/built-in.a(main.o):(.ARM.extab) is being placed in 
> > '.ARM.extab'
> > ld.lld: warning: init/built-in.a(main.o):(.ARM.extab.init.text) is being 
> > placed in '.ARM.extab.init.text'
> > ld.lld: warning: init/built-in.a(main.o):(.ARM.extab.ref.text) is being 
> > placed in '.ARM.extab.ref.text'
> > ld.lld: warning: init/built-in.a(do_mounts.o):(.ARM.extab.init.text) is 
> > being placed in '.ARM.extab.init.text'
> > ld.lld: warning: init/built-in.a(do_mounts.o):(.ARM.extab) is being placed 
> > in '.ARM.extab'
> > ld.lld: warning: init/built-in.a(do_mounts_rd.o):(.ARM.extab.init.text) is 
> > being placed in '.ARM.extab.init.text'
> > ld.lld: warning: init/built-in.a(do_mounts_rd.o):(.ARM.extab) is being 
> > placed in '.ARM.extab'
> > ld.lld: warning: init/built-in.a(do_mounts_initrd.o):(.ARM.extab.init.text) 
> > is being placed in '.ARM.extab.init.text'
> > ld.lld: warning: init/built-in.a(initramfs.o):(.ARM.extab.init.text) is 
> > being placed in '.ARM.extab.init.text'
> > ld.lld: warning: init/built-in.a(initramfs.o):(.ARM.extab) is being placed 
> > in '.ARM.extab'
> > ld.lld: warning: init/built-in.a(calibrate.o):(.ARM.extab.init.text) is 
> > being placed in '.ARM.extab.init.text'
> > ld.lld: warning: init/built-in.a(calibrate.o):(.ARM.extab) is being placed 
> > in '.ARM.extab'
> >
> > These sections are handled by the ARM_UNWIND_SECTIONS define, which is
> > only added to the list of sections when CONFIG_ARM_UNWIND is set.
> > CONFIG_ARM_UNWIND is a hidden symbol that is only selected when
> > CONFIG_UNWINDER_ARM is set so CONFIG_UNWINDER_FRAME_POINTER never
> > handles these sections. According to the help text of
> > CONFIG_UNWINDER_ARM, these sections should be discarded so that the
> > kernel image size is not affected.
>
> My apologies for taking so long to review this.
>
> I have a suspicion that these come from forcing on configs that
> Kconfig/menuconfig would block, and aren't clang or lld specific, yet
> are exposed by the new linker warnings for orphan section placement
> (good).  That said, we definitely have OEMs in Android land that still
> prefer the older unwinder.
>
> From https://developer.arm.com/documentation/ihi0038/b/ (click
> download in top left), section 4.4.1 "Sections" has a note:
>
> ```
> Tables are not required for ABI compliance at the C/Assembler level
> but are required for C++.
> ```
>
> Review-by: Nick Desaulniers 
> Tested-by: Nick Desaulniers 
>
> Please submit to:
> https://www.arm.linux.org.uk/developer/patches/add.php
>
> >
> > Fixes: 5a17850e251a ("arm/build: Warn on orphan section placement")
> > Link: https://github.com/ClangBuiltLinux/linux/issues/1152
> > Reported-by: kernel test robot 
> > Signed-off-by: Nathan Chancellor 
> > ---
> >  arch/arm/kernel/vmlinux.lds.S | 4 
> >  1 file changed, 4 insertions(+)
> >
> > diff --git a/arch/arm/kernel/vmlinux.lds.S b/arch/arm/kernel/vmlinux.lds.S
> > index 5f4922e858d0..a2c0d96b0580 100644
> > --- a/arch/arm/kernel/vmlinux.lds.S
> > +++ b/arch/arm/kernel/vmlinux.lds.S
> > @@ -40,6 +40,10 @@ SECTIONS
> > ARM_DISCARD
> >  #ifndef CONFIG_SMP_ON_UP
> > *(.alt.smp.init)
> > +#endif
> > +#ifndef CONFIG_ARM_UNWIND
> > +   *(.ARM.exidx*)
>
> I don't think we need the wildcard, as without this line, I see:
>
> ld.lld: warning: :(.ARM.exidx) is being placed in '.ARM.exidx'

We may need the wildcard if there are -ffunction-sections builds.
In clang, .ARM.exidx* cannot be removed even with -fno-unwind-tables
-fno-exceptions.

> though I do see binutils linker scripts use precisely what you have.
> So I guess that's fine.
>
> I guess we can't reuse `ARM_UNWIND_SECTIONS` since the ALIGN and
> linker-script-defined-symbols would be weird in a DISCARD clause?
>
>
> > +   *(.ARM.extab*)
> >  #endif
> > }
> >
> >
> > base-commit: 6e0bf0e0e55000742a53c5f3b58f8669e0091a11
> > --
>
>
> --
> Thanks,
> ~Nick Desaulniers
>
> --
> You received this message because you are subscribed to the Google Groups 
> "Clang Built Linux" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to clang-built-linux+unsubscr...@googlegroups.com.
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/clang-built-linux/CAKwvOd%3D%2B98r6F4JjrPEoWX88WQ%3DB-KMRP2eWojabLk6it3i5KA%40mail.gmail.com.



Re: [PATCH v5 13/36] vmlinux.lds.h: add PGO and AutoFDO input sections

2020-08-03 Thread Fāng-ruì Sòng

On 2020-08-03, Andi Kleen wrote:

Why is that? Both .text and .text.hot have alignment of 2^4 (default
function alignment on x86) by default, so it doesn't seem like it should
matter for packing density.  Avoiding interspersing cold text among


You may lose part of a cache line on each unit boundary. Linux has
a lot of units, some of them small. All these bytes add up.

It's bad for TLB locality too. Sadly with all the fine grained protection
changes the 2MB coverage is eroding anyways, but this makes it even worse.



Gives worse packing for the hot part
if they are not aligned to 64byte boundaries, which they are usually
not.


I do not see how the 64-byte argument is related to this patch.
If a function requires 64-byte alignment to be efficient, the compiler
should communicate this fact by setting the alignment of its containing
section to 64 bytes or above.

If a text section has a 16-byte alignment, the linker can reorder it to
an address which is a multiple of 16 but not a multiple of 64.

I agree with your other statement that having a single input section
description might be helpful. With more than one input section
descrition, the linker has to respect the ordering requirement. With
just one input section description, the linker has more freedom ordering
sections if profitable. For example, LLD performs two ordering
heuristics as my previous reply mentions.

It'd be good if someone can measure the benefit. Personally I don't
think this kind of ordering has significant benefit. (For
arm/aarch64/powerpc there might be some size benefit due to fewer range
extension thunks)


regular/hot text seems like it should be a net win.




That old commit doesn't reference efficiency -- it says there was some
problem with matching when they were separated out, but there were no
wildcard section names back then.


It was about efficiency.

-Andi

--
You received this message because you are subscribed to the Google Groups "Clang 
Built Linux" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to clang-built-linux+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/clang-built-linux/20200804044532.GC1321588%40tassilo.jf.intel.com.


Re: [PATCH v5 13/36] vmlinux.lds.h: add PGO and AutoFDO input sections

2020-08-03 Thread Fāng-ruì Sòng

On 2020-08-03, Arvind Sankar wrote:

On Mon, Aug 03, 2020 at 12:05:06PM -0700, Andi Kleen wrote:

> However, the history of their being together comes from
>
>   9bebe9e5b0f3 ("kbuild: Fix .text.unlikely placement")
>
> which seems to indicate there was some problem with having them separated out,
> although I don't quite understand what the issue was from the commit message.

Separating it out is less efficient. Gives worse packing for the hot part
if they are not aligned to 64byte boundaries, which they are usually not.

It also improves packing of the cold part, but that probably doesn't matter.

-Andi


Why is that? Both .text and .text.hot have alignment of 2^4 (default
function alignment on x86) by default, so it doesn't seem like it should
matter for packing density.  Avoiding interspersing cold text among
regular/hot text seems like it should be a net win.

That old commit doesn't reference efficiency -- it says there was some
problem with matching when they were separated out, but there were no
wildcard section names back then.


I just want to share some context. GNU ld's internal linker script does
impose a particular input section order by specifying separate input section 
descriptions:

  .text   :
  {
*(.text.unlikely .text.*_unlikely .text.unlikely.*)
*(.text.exit .text.exit.*)
*(.text.startup .text.startup.*)
*(.text.hot .text.hot.*)
*(SORT(.text.sorted.*))   # binutils 
5fa5f8f5fe494ba4fe98c11899a5464cd164ec75, invented for GCC's call graph 
profiling. LLVM doesn't use it
*(.text .stub .text.* .gnu.linkonce.t.*)
...

This order is a bit arbitrary. gold and LLD have -z keep-text-section-prefix. 
With the option,
there can be several output sections, with the 
'.unlikely'/'.exit'/'.startup'/etc suffix.
This has the advantage that the hot/unlikely/exit/etc attribution of a 
particular function is more obvious:

  [ 2] .text PROGBITS0040007c 7c 03 00  AX  
0   0  4
  [ 3] .text.startup PROGBITS0040007f 7f 01 00  AX  
0   0  1
  [ 4] .text.exitPROGBITS00400080 80 02 00  AX  
0   0  1
  [ 5] .text.unlikelyPROGBITS00400082 82 01 00  AX  0   0  1 
  ...


In our case we only need one output section...

If we place all text sections in one input section description:

*(.text.unlikely .text.*_unlikely .text.exit .text.exit.* .text.startup 
.text.startup.* .text.hot .text.hot.* ... )

In many cases the input sections are laid out in the input order. In LLD there 
are two ordering cases:

* If clang PGO (-fprofile-use=) is enabled, .llvm.call-graph-profile will be 
created automatically.
  LLD can perform reordering **within an input section description**. The 
ordering is quite complex,
  you can read 
https://github.com/llvm/llvm-project/blob/master/lld/ELF/CallGraphSort.cpp#L9 
if you are curious:)

  I don't know the performance improvement of this heuristic. (I don't think 
the original paper
  cgo2017-hfsort-final1.pdf took ThinLTO into account, so the result might not
  reflect realistic work loads where both ThinLTO and PGO are used) This, if 
matters, likely only matters
  for very large executable, not the case for the kernel.
* On some RISC architectures (ARM/AArch64/PowerPC),
  the ordered sections (due to either .llvm.call-graph-profile or
  --symbol-reordering-file=; the two can't be used together) are placed in a
  suitable place in the input section description
  ( http://reviews.llvm.org/D44969 )


In summary, using one (large) input section description may have some
performance improvement with LLD but I don't think it will be significant.
There may be some size improvement for ARM/AArch64/PowerPC if someone wants to 
test.


commit 9bebe9e5b0f3109a14000df25308c2971f872605
Author: Andi Kleen 
Date:   Sun Jul 19 18:01:19 2015 -0700

   kbuild: Fix .text.unlikely placement

   When building a kernel with .text.unlikely text the unlikely text for
   each translation unit was put next to the main .text code in the
   final vmlinux.

   The problem is that the linker doesn't allow more specific submatches
   of a section name in a different linker script statement after the
   main match.

   So we need to move them all into one line. With that change
   .text.unlikely is at the end of everything again.

   I also moved .text.hot into the same statement though, even though
   that's not strictly needed.

   Signed-off-by: Andi Kleen 
   Signed-off-by: Michal Marek 

diff --git a/include/asm-generic/vmlinux.lds.h 
b/include/asm-generic/vmlinux.lds.h
index 8bd374d3cf21..1781e54ea6d3 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -412,12 +412,10 @@
 * during second ld run in second ld pass when generating System.map */
#define TEXT_TEXT   \
ALIGN_FUNCTION();   \
-   

Re: [PATCH] vmlinux.lds: consider .text.{hot|unlikely}.* part of .text too

2020-06-22 Thread Fāng-ruì Sòng

On 2020-06-22, Nick Desaulniers wrote:

On Wed, Jun 17, 2020 at 2:27 PM Fāng-ruì Sòng  wrote:



On 2020-06-17, Nick Desaulniers wrote:
>ld.bfd's internal linker script considers .text.hot AND .text.hot.* to
>be part of .text, as well as .text.unlikely and .text.unlikely.*.

>ld.lld will produce .text.hot.*/.text.unlikely.* sections.

Correction to this sentence. lld is not relevant here.

-ffunction-sections combined with profile-guided optimization can
produce .text.hot.* .text.unlikely.* sections.  Newer clang may produce
.text.hot. .text.unlikely. (without suffix, but with a trailing dot)
when -fno-unique-section-names is specified, as an optimization to make
.strtab smaller.


Then why was the bug report reporting https://reviews.llvm.org/D79600
as the result of a bisection, if LLD is not relevant?  Was the
bisection wrong?


https://reviews.llvm.org/D79600 is an LLVM codegen change, unrelated to LLD..
(As described in the patch, LLD's -z keep-text-section-prefix only
recognizes ".text.exit.*", not ".text.exit")


The upstream report wasn't initially public, for no good reason.  So I
didn't include it, but if we end up taking v1, this should have

Link: https://bugs.chromium.org/p/chromium/issues/detail?id=1084760


Thanks for making it public.


The kernel doesn't use -fno-unique-section-names; is that another flag
that's added by CrOS' compiler wrapper?
https://source.chromium.org/chromiumos/chromiumos/codesearch/+/master:src/third_party/toolchain-utils/compiler_wrapper/config.go;l=110
Looks like no.  It doesn't use `-fno-unique-section-names` or
`-ffunction-sections`.


-fno-unique-section-names is a very rare option. It is not supported by GCC 
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95095 ).
clang users use it very rarely, probably because not many people care
about additional strings taken by section names ".text.hot.a" ".text.hot.b" 
".text.hot.c"
in the string table ".strtab" (clang since some point of 2018 uses
.strtab instead of .shstrtab which enables more string sharing).










We've already seen that GCC can place main in .text.startup without
-ffunction-sections. There may be other non -ffunction-sections cases
for .text.hot.* or .text.unlikely.*. So it is definitely a good idea to
be more specific even if we don't care about -ffunction-sections for
now.

>Make sure to group these together.  Otherwise these orphan sections may
>be placed outside of the the _stext/_etext boundaries.
>
>Cc: sta...@vger.kernel.org
>Link: 
https://sourceware.org/git/?p=binutils-gdb.git;a=commitdiff;h=add44f8d5c5c05e08b11e033127a744d61c26aee
>Link: 
https://sourceware.org/git/?p=binutils-gdb.git;a=commitdiff;h=1de778ed23ce7492c523d5850c6c6dbb34152655
>Link: https://reviews.llvm.org/D79600
>Reported-by: Jian Cai 
>Debugged-by: Luis Lozano 
>Suggested-by: Fāng-ruì Sòng 
>Tested-by: Luis Lozano 
>Tested-by: Manoj Gupta 
>Signed-off-by: Nick Desaulniers 
>---
> include/asm-generic/vmlinux.lds.h | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
>diff --git a/include/asm-generic/vmlinux.lds.h 
b/include/asm-generic/vmlinux.lds.h
>index d7c7c7f36c4a..fe5aaef169e3 100644
>--- a/include/asm-generic/vmlinux.lds.h
>+++ b/include/asm-generic/vmlinux.lds.h
>@@ -560,7 +560,9 @@
>  */
> #define TEXT_TEXT \
>   ALIGN_FUNCTION();   \
>-  *(.text.hot TEXT_MAIN .text.fixup .text.unlikely)   \
>+  *(.text.hot .text.hot.*)\
>+  *(TEXT_MAIN .text.fixup)\
>+  *(.text.unlikely .text.unlikely.*)  \
>   NOINSTR_TEXT\
>   *(.text..refcount)  \
>   *(.ref.text)\
>--
>2.27.0.290.gba653c62da-goog
>




--
Thanks,
~Nick Desaulniers


Re: [PATCH v2 1/3] vmlinux.lds.h: Add .gnu.version* to DISCARDS

2020-06-22 Thread Fāng-ruì Sòng
On Mon, Jun 22, 2020 at 3:57 PM Kees Cook  wrote:
>
> On Mon, Jun 22, 2020 at 03:52:37PM -0700, Fangrui Song wrote:
> > > And it's not in the output:
> > >
> > > $ readelf -Vs arch/x86/boot/compressed/vmlinux | grep version
> > > No version information found in this file.
> > >
> > > So... for the kernel we need to silence it right now.
> >
> > Re-link with -M (or -Map file) to check where .gnu.version{,_d,_r} input
> > sections come from?
>
> It's not reporting it correctly:
>
> .gnu.version_d  0x008966b00x0
>  .gnu.version_d
> 0x008966b00x0 
> arch/x86/boot/compressed/kernel_info.o
>
> .gnu.version0x008966b00x0
>  .gnu.version   0x008966b00x0 
> arch/x86/boot/compressed/kernel_info.o
>
> .gnu.version_r  0x008966b00x0
>  .gnu.version_r
> 0x008966b00x0 
> arch/x86/boot/compressed/kernel_info.o
>
> it just reports whatever file is listed on the link command line first.
>
> > If it is a bug, we should probably figure out which version of binutils
> > has fixed the bug.
>
> I see this with binutils 2.34...
>
> --
> Kees Cook

:( It deserves a binutils bug
(https://sourceware.org/bugzilla/enter_bug.cgi?product=binutils ) and
a comment..

With the description adjusted to say that this works around a bug

Reviewed-by: Fangrui Song 


Re: [PATCH] vmlinux.lds: consider .text.{hot|unlikely}.* part of .text too

2020-06-17 Thread Fāng-ruì Sòng



On 2020-06-17, Nick Desaulniers wrote:

ld.bfd's internal linker script considers .text.hot AND .text.hot.* to
be part of .text, as well as .text.unlikely and .text.unlikely.*.



ld.lld will produce .text.hot.*/.text.unlikely.* sections.


Correction to this sentence. lld is not relevant here.

-ffunction-sections combined with profile-guided optimization can
produce .text.hot.* .text.unlikely.* sections.  Newer clang may produce
.text.hot. .text.unlikely. (without suffix, but with a trailing dot)
when -fno-unique-section-names is specified, as an optimization to make
.strtab smaller.

We've already seen that GCC can place main in .text.startup without
-ffunction-sections. There may be other non -ffunction-sections cases
for .text.hot.* or .text.unlikely.*. So it is definitely a good idea to
be more specific even if we don't care about -ffunction-sections for
now.


Make sure to group these together.  Otherwise these orphan sections may
be placed outside of the the _stext/_etext boundaries.

Cc: sta...@vger.kernel.org
Link: 
https://sourceware.org/git/?p=binutils-gdb.git;a=commitdiff;h=add44f8d5c5c05e08b11e033127a744d61c26aee
Link: 
https://sourceware.org/git/?p=binutils-gdb.git;a=commitdiff;h=1de778ed23ce7492c523d5850c6c6dbb34152655
Link: https://reviews.llvm.org/D79600
Reported-by: Jian Cai 
Debugged-by: Luis Lozano 
Suggested-by: Fāng-ruì Sòng 
Tested-by: Luis Lozano 
Tested-by: Manoj Gupta 
Signed-off-by: Nick Desaulniers 
---
include/asm-generic/vmlinux.lds.h | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/include/asm-generic/vmlinux.lds.h 
b/include/asm-generic/vmlinux.lds.h
index d7c7c7f36c4a..fe5aaef169e3 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -560,7 +560,9 @@
 */
#define TEXT_TEXT   \
ALIGN_FUNCTION();   \
-   *(.text.hot TEXT_MAIN .text.fixup .text.unlikely)   \
+   *(.text.hot .text.hot.*)\
+   *(TEXT_MAIN .text.fixup)\
+   *(.text.unlikely .text.unlikely.*)  \
NOINSTR_TEXT\
*(.text..refcount)  \
*(.ref.text)\
--
2.27.0.290.gba653c62da-goog