On Thu, 21 Jan 2021 at 21:35, Daniel Engel <lib...@danielengel.com> wrote:
>
> Hi Christophe,
>
> On Thu, Jan 21, 2021, at 2:29 AM, Christophe Lyon wrote:
> > On Sat, 16 Jan 2021 at 17:13, Daniel Engel <lib...@danielengel.com> wrote:
> > >
> > > Hi Christophe,
> > >
> > > On Fri, Jan 15, 2021, at 4:30 AM, Christophe Lyon wrote:
> > > > On Fri, 15 Jan 2021 at 12:39, Daniel Engel <lib...@danielengel.com> 
> > > > wrote:
> > > > >
> > > > > Hi Christophe,
> > > > >
> > > > > On Mon, Jan 11, 2021, at 8:39 AM, Christophe Lyon wrote:
> > > > > > On Mon, 11 Jan 2021 at 17:18, Daniel Engel <lib...@danielengel.com> 
> > > > > > wrote:
> > > > > > >
> > > > > > > On Mon, Jan 11, 2021, at 8:07 AM, Christophe Lyon wrote:
> > > > > > > > On Sat, 9 Jan 2021 at 14:09, Christophe Lyon 
> > > > > > > > <christophe.l...@linaro.org> wrote:
> > > > > > > > >
> > > > > > > > > On Sat, 9 Jan 2021 at 13:27, Daniel Engel 
> > > > > > > > > <lib...@danielengel.com> wrote:
> > > > > > > > > >
> > > > > > > > > > On Thu, Jan 7, 2021, at 4:56 AM, Richard Earnshaw wrote:
> > > > > > > > > > > On 07/01/2021 00:59, Daniel Engel wrote:
> > > > > > > > > > > > --snip--
> > > > > > > > > > > >
> > > > > > > > > > > > On Wed, Jan 6, 2021, at 9:05 AM, Richard Earnshaw wrote:
> > > > > > > > > > > > --snip--
> > > > > > > > > > > >
> > > > > > > > > > > >> - finally, your popcount implementations have data in 
> > > > > > > > > > > >> the code segment.
> > > > > > > > > > > >>  That's going to cause problems when we have 
> > > > > > > > > > > >> compilation options such as
> > > > > > > > > > > >> -mpure-code.
> > > > > > > > > > > >
> > > > > > > > > > > > I am just following the precedent of existing lib1funcs 
> > > > > > > > > > > > (e.g. __clz2si).
> > > > > > > > > > > > If this matters, you'll need to point in the right 
> > > > > > > > > > > > direction for the
> > > > > > > > > > > > fix.  I'm not sure it does matter, since these 
> > > > > > > > > > > > functions are PIC anyway.
> > > > > > > > > > >
> > > > > > > > > > > That might be a bug in the clz implementations - 
> > > > > > > > > > > Christophe: Any thoughts?
> > > > > > > > > >
> > > > > > > > > > __clzsi2() has test coverage in 
> > > > > > > > > > "gcc.c-torture/execute/builtin-bitops-1.c"
> > > > > > > > > Thanks, I'll have a closer look at why I didn't see problems.
> > > > > > > > >
> > > > > > > >
> > > > > > > > So, that's because the code goes to the .text section (as 
> > > > > > > > opposed to
> > > > > > > > .text.noread)
> > > > > > > > and does not have the PURECODE flag. The compiler takes care of 
> > > > > > > > this
> > > > > > > > when generating code with -mpure-code.
> > > > > > > > And the simulator does not complain because it only checks 
> > > > > > > > loads from
> > > > > > > > the segment with the PURECODE flag set.
> > > > > > > >
> > > > > > > This is far out of my depth, but can something like:
> > > > > > >
> > > > > > > ifeq (,$(findstring __symbian__,$(shell $(gcc_compile_bare) -dM 
> > > > > > > -E - </dev/null)))
> > > > > > >
> > > > > > > be adapted to:
> > > > > > >
> > > > > > > a) detect the state of the -mpure-code switch, and
> > > > > > > b) pass that flag to the preprocessor?
> > > > > > >
> > > > > > > If so, I can probably fix both the target section and the data 
> > > > > > > usage.
> > > > > > > Just have to add a few instructions to finish unrolling the loop.
> > > > > >
> > > > > > I must confess I never checked libgcc's Makefile deeply before,
> > > > > > but it looks like you can probably detect whether -mpure-code is
> > > > > > part of $CFLAGS.
> > > > > >
> > > > > > However, it might be better to write pure-code-safe code
> > > > > > unconditionally because the toolchain will probably not
> > > > > > be rebuilt with -mpure-code as discussed before.
> > > > > > Or that could mean adding a -mpure-code multilib....
> > > > >
> > > > > I have learned a few things since the last update.  I think I know how
> > > > > to get -mpure-code out of CFLAGS and into a macro.  However, I have 
> > > > > hit
> > > > > something of a wall with testing.  I can't seem to compile any flavor 
> > > > > of
> > > > > libgcc with CFLAGS_FOR_TARGET="-mpure-code".
> > > > >
> > > > > 1.  Configuring --with-multilib-list=rmprofile results in build 
> > > > > failure:
> > > > >
> > > > >     checking for suffix of object files... configure: error: in 
> > > > > `/home/mirdan/gcc-obj/arm-none-eabi/libgcc':
> > > > >     configure: error: cannot compute suffix of object files: cannot 
> > > > > compile
> > > > >     See `config.log' for more details
> > > > >
> > > > >    cc1: error: -mpure-code only supports non-pic code on M-profile 
> > > > > targets
> > > > >
> > > >
> > > > Yes, I did hit that wall too :-)
> > > >
> > > > Hence what we discussed earlier: the toolchain is not rebuilt with 
> > > > -mpure-code.
> > > >
> > > > Note that there are problems in newlib too, but users of -mpure-code 
> > > > seem
> > > > to be able to work around that (eg. using their own startup code and no 
> > > > stdlib)
> > >
> > > Is there a current side project to solve the makefile problems?
> > None that I know of.
> >
> >
> > > I think I'm back to my original question: If libgcc can't be built
> > > with -mpure-code, and users bypass it completely with -nostdlib, then
> > > why this conversation about pure-code compatibility of __clzsi2() etc?
> > I think Richard noticed this pre-existing problem as part of the review
> > of your patches. I don't think I meant fixing this is a prerequisite.
> > But maybe I misunderstood :-)
>
> I might have misunderstood too then.  It was certainly a pre-existing
> problem, but I took the comments to mean that I had to own it as part of
> touching those functions.
>
> > > > > 2.  Attempting to filter the multib list results in configuration 
> > > > > error.
> > > > >     This might have been misguided, but it was something I tried:
> > > > >
> > > > >     Error: --with-multilib-list=armv6s-m not supported.
> > > > >
> > > > >     Error: --with-multilib-list=mthumb/march=armv6s-m/mfloat-abi=soft 
> > > > > not supported
> > > >
> > > > I think only 2 values are supported: aprofile and rmprofile.
> > >
> > > It looks like this might require a custom t-* multilib in gcc/config/arm.
> > Or we could add -mpure-code to the rmprofile list.
>
> I have no strong opinions here.  Are you proposing that the "m" versions
> of libgcc be built with -mpure-code enabled by default, or are you
> proposing a parallel set of multilibs?  -mpure-code by default would
> have costs in both size and speed.
I'm not sure how large the penalty is for thumb-2 cores?
Maybe it's acceptable to build thumb-2 with -mpure-code by default,
but probably not for cortex-m0.

> > > > > 3.  Attempting to configure a single architecture results in a build 
> > > > > error.
> > > > >
> > > > >     --with-mode=thumb --with-arch=armv6s-m --with-float=soft
> > > > >
> > > > >     checking for suffix of object files... configure: error: in 
> > > > > `/home/mirdan/gcc-obj/arm-none-eabi/arm/autofp/v5te/fpu/libgcc':
> > > > >     configure: error: cannot compute suffix of object files: cannot 
> > > > > compile
> > > > >     See `config.log' for more details
> > > > >
> > > > >     conftest.c:9:10: fatal error: ac_nonexistent.h: No such file or 
> > > > > directory
> > > > >         9 | #include <ac_nonexistent.h>
> > > > >           |          ^~~~~~~~~~~~~~~~~~
> > > > I never saw that error message, but I never build using --with-arch.
> > > > I do use --with-cpu though.
> > > >
> > > > > This has me wondering whether pure-code in libgcc is a real issue ...
> > > > > If there's a way to build libgcc with -mpure-code, please enlighten 
> > > > > me.
> > > > I haven't done so yet. Maybe building the toolchain --with-cpu=cortex-m0
> > > > works?
> > >
> > > No luck with that.  Same error message as before:
> > >
> > > 4.  --with-mode=thumb --with-arch=armv6s-m --with-float=soft 
> > > --with-cpu=cortex-m0
> > >
> > >     Switch "--with-arch" may not be used with switch "--with-cpu"
> > >
> > > 5.  Then: --with-mode=thumb --with-float=soft --with-cpu=cortex-m0
> > >
> > >     checking for suffix of object files... configure: error: in 
> > > `/home/mirdan/gcc-obj/arm-none-eabi/arm/autofp/v5te/fpu/libgcc':
> > >     configure: error: cannot compute suffix of object files: cannot 
> > > compile
> > >     See `config.log' for more details
> > >
> > >     cc1: error: -mpure-code only supports non-pic code on M-profile 
> > > targets
> > Yes that's because default multilibs include targets incompatible with
> > -mpure-code
> >
> > > 6.  Finally! --with-float=soft --with-cpu=cortex-m0 --disable-multilib
> > >
> > > Once you know this, and read the docs sideways, the previous errors are
> > > all probably "works as designed".  But, I can still grumble.
> > Yes, I think it's "as designed". I faced the "incompatible multilibs" issue
> > too some time ago. Hence testing is not easy.
> >
> > > With libgcc compiled with -mpure-code, I can confirm that
> > > 'builtin-bitops-1.c' (the test for __clzsi2) passed with libgcc as-is.
> > >
> > > I then added the SHF_ARM_PURECODE flag to the libgcc assembly functions
> > > and re-ran the test.  Still passed.  I then added -mpure-code to
> > > RUNTESTFLAGS and re-ran the test.  Still passed.  readelf confirmed that
> > > the test program is compiling as expected [1]:
> > >
> > >     [ 2] .text             PROGBITS        0000800c 00800c 003314 00 AXy  
> > > 0   0  4
> > >     Key to Flags:
> > >     W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
> > >     L (link order), O (extra OS processing required), G (group), T (TLS),
> > >     C (compressed), x (unknown), o (OS specific), E (exclude),
> > >     y (purecode), p (processor specific)
> > >
> > > It was only when I started inserting pure-code test directives into
> > > 'builtin-bitops-1.c' that 'make check' began to report errors.
> > >
> > >     /* { dg-do compile } */
> > >     ...
> > >     /* { dg-options "-mpure-code -mfp16-format=ieee" } */
> > >     /* { dg-final { scan-assembler-not 
> > > "\\.(float|l\\?double|\d?byte|short|int|long|quad|word)\\s+\[^.\]" } } */
> > >
> > > However, for reasons [2] [3] [4] [5], this wasn't actually useful.  It's
> > > sufficient to say that there are many reasons that non-pure-code
> > > compatible functions exist in libgcc.
> >
> > I filed a PR to better track this discussion:
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98779
>
> Possibly worth noting that my patch series addresses __clzsi2, __ctzsi2,
> __aeabi_ldivmod, __aeabi_uldivmod, and most of __gnu_float2h_internal as
> long as CFLAGS_FOR_TARGET contains -mpure-code.
>
Great.

> >
> > >
> > > Although I'm not sure how useful this will be in light of the previous
> > > findings, I did take the opportunity with a working compile process to
> > > modify the relevant assembly functions for -mpure-code compatibility.
> > > I can manually disassemble the library and verify correct compilation.
> > > I can manually run a non-pure-code builtin-bitops-1 with a pure-code
> > > library to verify correct execution.  But, I don't think your standard
> > > regression suite will be able to exercise the new paths.
> > Thanks for doing that. With the SHF_ARM_PURECODE flag you
> > added to clz, I think my simulator would catch problems.
>
> See my more detailed comments following.  Unless you're also using a
> custom linker script, I would have expected any simulator capable of
> catching errors to have already caught them.  Putting the
> SHF_ARM_PURECODE flag on clz actually seems rather cosmetic.
>

Yes, I have a custom linker script with
  .text.noread      :
    {
        INPUT_SECTION_FLAGS (SHF_ARM_PURECODE) *(.text*)
    } > purecode_memory

> > > The patch is below; you can consider this as 34/33 in the series.
> > >
> > > Regards,
> > > Daniel
> > >
> > > [1] It's pretty clear that the section flags in libgcc have never really
> > >     mattered.  When the linker strings all of the used objects together,
> > >     the original sections disappear into a single output object. The
> > >     compiler controls those flags regardless of what libgcc does.)
> > Not sure what you mean? The linker creates two segments, one
> > with and one without the SHF_ARM_PURECODE flag.
>
> When libgcc is compiled "normally", individual objects in libgcc.a are
> compiled _without_ SHF_ARM_PURECODE.  Note line 4 below, with flags
> "AX" only (no "y").
>
> `readelf -S arm-none-eabi/thumb/v6-m/nofp/libgcc/libgcc.a`
>
>     File: arm-none-eabi/thumb/v6-m/nofp/libgcc/libgcc.a(_clzsi2.o)
>     There are 19 section headers, starting at offset 0x43c:
>
>     Section Headers:
>       [Nr] Name              Type            Addr     Off    Size   ES Flg Lk 
> Inf Al
>       [ 0]                   NULL            00000000 000000 000000 00      0 
>   0  0
>       [ 1] .text             PROGBITS        00000000 000034 000000 00  AX  0 
>   0  2
>       [ 2] .data             PROGBITS        00000000 000034 000000 00  WA  0 
>   0  1
>       [ 3] .bss              NOBITS          00000000 000034 000000 00  WA  0 
>   0  1
>   ==> [ 4] .text.sorted[...] PROGBITS        00000000 000034 000034 00  AX  0 
>   0  4
>       ...
>     Key to Flags:
>       W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
>       L (link order), O (extra OS processing required), G (group), T (TLS),
>       C (compressed), x (unknown), o (OS specific), E (exclude),
>       y (purecode), p (processor specific)
>
> When this "normal" libgcc is linked into an -mpure-code program
> (e.g. with RUNTESTFLAGS), the linker script flattens all of the
> sections together into a single output.  The relevant portion of the
> linker script:
>
>       .text           :
>       {
>         *(.text.unlikely .text.*_unlikely .text.unlikely.*)
>         *(.text.exit .text.exit.*)
>         *(.text.startup .text.startup.*)
>         *(.text.hot .text.hot.*)
>         *(SORT(.text.sorted.*)) // _clzsi2.o matches here
>         *(.text .stub .text.* .gnu.linkonce.t.*) // main.o matches here
>         /* .gnu.warning sections are handled specially by elf.em.  */
>         *(.gnu.warning)
>         *(.glue_7t) *(.glue_7) *(.vfp11_veneer) *(.v4_bx)
>       }
>
> I can't pretend to know how the linker merges conflicting flags from the
> various input sections, but the final binary has the attributes
> "AXy" as expected from the top level compile (line 2):
>
> `readelf -Sl builtin-bitops-1.exe`
>
>     There are 22 section headers, starting at offset 0x10934:
>
>     Section Headers:
>       [Nr] Name              Type            Addr     Off    Size   ES Flg Lk 
> Inf Al
>       [ 0]                   NULL            00000000 000000 000000 00      0 
>   0  0
>       [ 1] .init             PROGBITS        00008000 008000 00000c 00  AX  0 
>   0  4
>   ==> [ 2] .text             PROGBITS        0000800c 00800c 00455c 00 AXy  0 
>   0  4
>       [ 3] .fini             PROGBITS        0000c568 00c568 00000c 00  AX  0 
>   0  4
>       [ 4] .rodata           PROGBITS        0000c574 00c574 000050 00   A  0 
>   0  4
>       [ 5] .ARM.exidx        ARM_EXIDX       0000c5c4 00c5c4 000008 00  AL  2 
>   0  4
>       [ 6] .eh_frame         PROGBITS        0000c5cc 00c5cc 000124 00   A  0 
>   0  4
>       [ 7] .init_array       INIT_ARRAY      0001c6f0 00c6f0 000004 04  WA  0 
>   0  4
>       [ 8] .fini_array       FINI_ARRAY      0001c6f4 00c6f4 000004 04  WA  0 
>   0  4
>       [ 9] .data             PROGBITS        0001c6f8 00c6f8 000a30 00  WA  0 
>   0  8
>       [10] .bss              NOBITS          0001d128 00d128 000114 00  WA  0 
>   0  4
>       ...
>     Key to Flags:
>       W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
>       L (link order), O (extra OS processing required), G (group), T (TLS),
>       C (compressed), x (unknown), o (OS specific), E (exclude),
>       y (purecode), p (processor specific)
>
>     There are 3 program headers, starting at offset 52
>
>     Program Headers:
>       Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
>       EXIDX          0x00c5c4 0x0000c5c4 0x0000c5c4 0x00008 0x00008 R   0x4
>       LOAD           0x000000 0x00000000 0x00000000 0x0c6f0 0x0c6f0 R E 
> 0x10000
>       LOAD           0x00c6f0 0x0001c6f0 0x0001c6f0 0x00a38 0x00b4c RW  
> 0x10000
>
>      Section to Segment mapping:
>       Segment Sections...
>        00     .ARM.exidx
>   ==>   01     .init .text .fini .rodata .ARM.exidx .eh_frame
>        02     .init_array .fini_array .data .bss
>
> A segment contains complete sections, not the other way around.  As you
> can see above, the first LOAD segment contains the entire ".text" plus
> some other sections.   Thus, SHF_ARM_PURECODE flag really appears
> to apply to all of "text", even though the bits linked in from libgcc
> weren't built or flagged this way.
>
> >
> > > [2] The existing pure-code tests are compile-only and cover just the
> > >     disassembled 'main.o'.  There is no test of a complete executable
> > >     and there is no execution/simulation.
> > That's something I did manually: run the full gcc testsuite, forcing 
> > -mpure-code
> > in RUNTESTFLAGS. This way, all execution tests are compiled with 
> > -mpure-code,
> > and this is how I found several bugs.
> >
> > > [3] While other parts of binutils may understand SHF_ARM_PURECODE, I
> > >     don't think the simulator checks section flags or throws exceptions.
> > Indeed, I know of no public simulator that honors this flag. I do have
> > one though.
> >
> > > [4] builtin-bitops-1 modified this way will always fail due to the array
> > >     data definitions (longs, longlongs, etc).  GCC can't translate those
> > >     to instructions.  While the ".data" section would presumably be
> > >     readable, scan-assembler-not doesn't know the difference.
> > Sure, adding such scan-assembler-not is not suitable for any existing 
> > testcase.
> > That's why it is only in place for testcases dedicated to -mpure-code.
> >
> > > [5] Even if the simulator were modified to throw exceptions, this will
> > >     continue to fail because _mainCRTStartup uses a literal pool.
> > Yes, in general, and that's why I mentioned problems with newlib
> > earlier in this thread.
> >
> > However, the simulator I use only throws an exception for code executed with
> > SHF_ARM_PURECODE. Code in the "regular" code segment is not checked.
> > So this does not catch errors in hand-written assembly code using regular 
> > .text,
> > but it enables to run larger validations (such as the whole GCC testsuite)
> > without having to fix all of newlib.
> > Not perfect, as it left the issues in libgcc we are discussing, but it
> > helped me fix
> > several bugs in -mpure-code.
>
> Yet again I suspect that you have a custom linker script, or there's
> some other major difference.  Using the public releases of binutils,
> newlib, etc, my experiences just aren't lining up with yours.

Yep, the linker script makes the difference.

>
> > Thanks,
> >
> > Christophe
> >
> > >
> > > > Thanks,
> > > >
> > > > Christophe
> > > >
> > > > --snip--
>
> If the test server farm is free at some point, would you mind running
> another set of regression tests on my v5 patch series?

Sure. Given the number of sub-patches, can you send it to me as a
single patch file
(git format) that I can directly apply to GCC trunk?
My mailer does not want to help with saving each patch as a proper
patch file :-(

Thanks

Christophe

>
> Regards,
> Daniel

Reply via email to