[Bug target/106671] aarch64: BTI instruction are not inserted for cross-section direct calls

2024-07-16 Thread julien.voisin+gnu at dustri dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106671

--- Comment #19 from jvoisin  ---
> That's not a good reason to weaken the security of the generated code.

Having BTI with more valid targets is still better than no BTI at all, and it
would still be better than what clang is doing.

[Bug target/106671] aarch64: BTI instruction are not inserted for cross-section direct calls

2023-08-21 Thread broonie at kernel dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106671

--- Comment #18 from Mark Brown  ---
It's section placement stuff that's triggering this. You will also be able to
build a larger kernel if you try, though I'm not sure that's practical.

[Bug target/106671] aarch64: BTI instruction are not inserted for cross-section direct calls

2023-08-21 Thread wilco at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106671

--- Comment #17 from Wilco  ---
(In reply to Mark Brown from comment #13)
> The kernel hasn't got any problem with BTI as far as I am aware - when built
> with clang we run the kernel with BTI enabled since clang does just insert a
> BTI C at the start of every function, and GCC works fine so long as we don't
> get any out of range jumps being generated. The issue is that we don't have
> anything to insert veneers in the case where section placement puts static
> functions into a distant enough part of memory to need an indirect jump but
> GCC has decided to omit the landing pad.

Is the kernel already larger than 128 MBytes .text? Or do people do weird stuff
with section placement that causes branches to be out of range?

[Bug target/106671] aarch64: BTI instruction are not inserted for cross-section direct calls

2023-08-21 Thread rearnsha at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106671

--- Comment #16 from Richard Earnshaw  ---
(In reply to Mark Brown from comment #15)
> The kernel module loader simply does not insert veneers at present, and
> there were some implementation concerns IIRC.

That's not a good reason to weaken the security of the generated code.

[Bug target/106671] aarch64: BTI instruction are not inserted for cross-section direct calls

2023-08-21 Thread broonie at kernel dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106671

--- Comment #15 from Mark Brown  ---
The kernel module loader simply does not insert veneers at present, and there
were some implementation concerns IIRC.

[Bug target/106671] aarch64: BTI instruction are not inserted for cross-section direct calls

2023-08-21 Thread rearnsha at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106671

--- Comment #14 from Richard Earnshaw  ---
(In reply to Mark Brown from comment #13)
> The kernel hasn't got any problem with BTI as far as I am aware - when built
> with clang we run the kernel with BTI enabled since clang does just insert a
> BTI C at the start of every function, and GCC works fine so long as we don't
> get any out of range jumps being generated. The issue is that we don't have
> anything to insert veneers in the case where section placement puts static
> functions into a distant enough part of memory to need an indirect jump but
> GCC has decided to omit the landing pad.

The linker has to insert the veneers.

[Bug target/106671] aarch64: BTI instruction are not inserted for cross-section direct calls

2023-08-15 Thread broonie at kernel dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106671

--- Comment #13 from Mark Brown  ---
The kernel hasn't got any problem with BTI as far as I am aware - when built
with clang we run the kernel with BTI enabled since clang does just insert a
BTI C at the start of every function, and GCC works fine so long as we don't
get any out of range jumps being generated. The issue is that we don't have
anything to insert veneers in the case where section placement puts static
functions into a distant enough part of memory to need an indirect jump but GCC
has decided to omit the landing pad.

[Bug target/106671] aarch64: BTI instruction are not inserted for cross-section direct calls

2023-08-15 Thread nsz at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106671

--- Comment #12 from nsz at gcc dot gnu.org ---
(In reply to Jiangning Liu from comment #11)
> Hi Wilco,
> 
> > "it means we will need a linker optimization to remove those redundant BTIs 
> > (eg. by changing them into NOPs)"
> 
> It will be only for performance optimization, right? If we don't care about
> performance, the linker doesn't need to optimize it to be NOP, right? It
> could still be useful if we only do this operation for a specific module.

no, this is a security feature, we want as few BTI c in an executable
segment as possible.

[Bug target/106671] aarch64: BTI instruction are not inserted for cross-section direct calls

2023-08-14 Thread jiangning.liu at amperecomputing dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106671

--- Comment #11 from Jiangning Liu  ---
Hi Wilco,

> "it means we will need a linker optimization to remove those redundant BTIs 
> (eg. by changing them into NOPs)"

It will be only for performance optimization, right? If we don't care about
performance, the linker doesn't need to optimize it to be NOP, right? It could
still be useful if we only do this operation for a specific module.

Thanks,
-Jiangning

[Bug target/106671] aarch64: BTI instruction are not inserted for cross-section direct calls

2023-08-11 Thread wilco at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106671

Wilco  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |wilco at gcc dot gnu.org

--- Comment #10 from Wilco  ---
(In reply to Feng Xue from comment #9)
> On some occasions, we may not use the new ld, the kernel-building relies on
> its own runtime linker which is used for kernel modules. So I created a
> patch (https://gcc.gnu.org/pipermail/gcc-patches/2023-August/626084.html),
> and this provides user another option that could be done at the compiler
> side.

Reducing BTI is important for security. With LTO a binary should only have BTI
on functions that are indirectly called. So I don't like the idea of adding
more BTI with a new option - it means we will need a linker optimization to
remove those redundant BTIs (eg. by changing them into NOPs).

Note that branch offsets up to 256MB don't need special veneer handling: one
should place a direct branch about halfway to the destination.

Does Linux do any weird hacks in -fpatchable-function-entry that makes it hard
to use BTI?
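
The range arithmetic behind the 256MB remark above can be sketched as follows. This is only an illustration, not code from GCC or ld: the helper names are hypothetical, and the only architectural fact assumed is that B/BL encode a signed 26-bit word offset (about +/-128 MiB). A target a little under twice that distance can be reached by chaining two direct branches through a point roughly halfway, so no indirect jump (and hence no BTI landing pad) is involved.

```cpp
#include <cassert>
#include <cstdint>

// AArch64 B/BL encode a signed 26-bit word offset, so the reachable
// byte offsets are [-2^27, 2^27 - 4], i.e. roughly +/-128 MiB.
constexpr std::int64_t kBranchMin = -(std::int64_t{1} << 27);
constexpr std::int64_t kBranchMax = (std::int64_t{1} << 27) - 4;

// True if a single direct branch placed at `from` can reach `to`.
// (Hypothetical helper; addresses are plain byte addresses.)
bool direct_branch_reaches(std::int64_t from, std::int64_t to) {
    const std::int64_t off = to - from;
    return off % 4 == 0 && off >= kBranchMin && off <= kBranchMax;
}

// True if `to` can be reached from `from` through one intermediate
// direct branch placed at the word-aligned midpoint.
// (Hypothetical helper; a real linker is free to pick another spot.)
bool reachable_via_midpoint(std::int64_t from, std::int64_t to) {
    const std::int64_t mid = from + ((to - from) / 2) / 4 * 4;
    return direct_branch_reaches(from, mid) && direct_branch_reaches(mid, to);
}
```

With these definitions a 200 MiB hop is out of range for a single branch but in range for the two-branch chain, while a 400 MiB hop is out of range for both.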

[Bug target/106671] aarch64: BTI instruction are not inserted for cross-section direct calls

2023-08-02 Thread fxue at os dot amperecomputing.com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106671

Feng Xue  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |fxue at os dot amperecomputing.com

--- Comment #9 from Feng Xue  ---
On some occasions, we may not use the new ld, the kernel-building relies on its
own runtime linker which is used for kernel modules. So I created a patch
(https://gcc.gnu.org/pipermail/gcc-patches/2023-August/626084.html), and this
provides user another option that could be done at the compiler side.

[Bug target/106671] aarch64: BTI instruction are not inserted for cross-section direct calls

2023-03-23 Thread broonie at kernel dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106671

--- Comment #8 from Mark Brown  ---
Note that the issue was found in the Linux kernel - we were expecting to see
the BTI Cs there, it's certainly a lot simpler to work with.

[Bug target/106671] aarch64: BTI instruction are not inserted for cross-section direct calls

2023-03-23 Thread nsz at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106671

nsz at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |nsz at gcc dot gnu.org
             Status|NEW                         |WAITING

--- Comment #7 from nsz at gcc dot gnu.org ---
fixed in bfd ld 2.41 see
https://sourceware.org/bugzilla/show_bug.cgi?id=30076

we can also fix gcc to work with older ld (emit bti c in local functions), but
i don't plan to do that unless there is a reason to do so. (it increases the
emitted bti c considerably in some workloads, e.g. linux kernel, while the
linker fix is less intrusive in the common case with small binaries and no
weird section hacks).

[Bug target/106671] aarch64: BTI instruction are not inserted for cross-section direct calls

2022-08-18 Thread rearnsha at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106671

--- Comment #6 from Richard Earnshaw  ---
(In reply to Richard Earnshaw from comment #5)
> (In reply to D Scott Phillips from comment #2)
> > (In reply to Andrew Pinski from comment #1)
> > > Shouldn't the linker add the BTI inside the ___veneer instead?
> > 
> > The bti instruction has to be placed at the target of the indirect branch
> > (at the top of `func` in this case) so I don't think it would be possible to
> > work around this just within the veneer.
> 
> The veneer has to be placed 'near' the target and then end with a direct
> branch instruction.  The linker should be able to work this out.

This might, of course, mean that two veneers are needed in this case, one that
can be reached from the initial branch, and one that can reach the final
target.  A direct branch will jump to the first and the second one will be
reached by an indirect jump (needing a BTI at the start).
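
The two-veneer idea described above can be modelled with a small decision function. This is a sketch under stated assumptions, not the logic ld actually implements: `Plan`, `plan_call`, and the approximate 128 MiB range constant are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Possible outcomes for one call site (hypothetical classification).
enum class Plan { DirectBranch, OneVeneer, TwoVeneers };

// Approximate reach of a direct B/BL in bytes (about 128 MiB).
constexpr std::int64_t kDirectRange = std::int64_t{1} << 27;

// Decide how a call from `from` to `to` could be routed.
Plan plan_call(std::int64_t from, std::int64_t to, bool target_has_bti) {
    const std::int64_t dist = to > from ? to - from : from - to;
    if (dist < kDirectRange)
        return Plan::DirectBranch;  // no veneer, so no BTI check is involved
    // Far target: a veneer near the caller performs an indirect jump.
    // If the target already starts with a BTI, the jump can land on it.
    // Otherwise a second veneer near the target supplies the BTI landing
    // pad and finishes with a direct branch into the function.
    return target_has_bti ? Plan::OneVeneer : Plan::TwoVeneers;
}
```

The point of the second veneer is that the landing pad moves out of the function and into linker-generated code, so the compiler can keep omitting BTI C from functions that are only called directly.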

[Bug target/106671] aarch64: BTI instruction are not inserted for cross-section direct calls

2022-08-18 Thread rearnsha at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106671

--- Comment #5 from Richard Earnshaw  ---
(In reply to D Scott Phillips from comment #2)
> (In reply to Andrew Pinski from comment #1)
> > Shouldn't the linker add the BTI inside the ___veneer instead?
> 
> The bti instruction has to be placed at the target of the indirect branch
> (at the top of `func` in this case) so I don't think it would be possible to
> work around this just within the veneer.

The veneer has to be placed 'near' the target and then end with a direct branch
instruction.  The linker should be able to work this out.

[Bug target/106671] aarch64: BTI instruction are not inserted for cross-section direct calls

2022-08-17 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106671

--- Comment #4 from Andrew Pinski  ---
(In reply to Andrew Pinski from comment #3)
> Basically:
> void
> aarch64_print_patchable_function_entry (FILE *file,
>                                         unsigned HOST_WIDE_INT patch_area_size,
>                                         bool record_p)
> {
>   if (cfun->machine->label_is_assembled
>       && aarch64_bti_enabled ()
>       && !cgraph_node::get (cfun->decl)->only_called_directly_p ())
> 
> 
> That last check just needs to be removed as there is no way to know if the
> linker will output a veneer.

That only fixes the -fpatchable-function-entry= case.

aarch64-bti-insert.cc needs to be fixed too:
  /* Since a Branch Target Exception can only be triggered by an indirect call,
     we exempt function that are only called directly.  We also exempt
     functions that are already protected by Return Address Signing (PACIASP/
     PACIBSP).  For all other cases insert a BTI C at the beginning of the
     function.  */
  if (!cgraph_node::get (cfun->decl)->only_called_directly_p ())
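
The effect of dropping that check can be illustrated with a toy model. This is not GCC code: `FunctionInfo` and both predicates are hypothetical stand-ins for `aarch64_bti_enabled ()` and `only_called_directly_p ()`, and the exemption for functions already protected by return address signing is deliberately left out.

```cpp
#include <cassert>

// Toy stand-in for the per-function facts the compiler consults.
struct FunctionInfo {
    bool bti_enabled;           // stands in for aarch64_bti_enabled ()
    bool only_called_directly;  // stands in for only_called_directly_p ()
};

// Behaviour discussed in the thread: functions that are only called
// directly get no BTI C at their entry.
bool emits_bti_c_current(const FunctionInfo &f) {
    return f.bti_enabled && !f.only_called_directly;
}

// Behaviour if the only_called_directly_p () check were removed:
// every function gets a BTI C whenever BTI is enabled.
bool emits_bti_c_proposed(const FunctionInfo &f) {
    return f.bti_enabled;
}
```

The two predicates differ only for a function that is called directly everywhere, which is exactly the case where a linker-inserted veneer can turn a direct call into an indirect jump.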

[Bug target/106671] aarch64: BTI instruction are not inserted for cross-section direct calls

2022-08-17 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106671

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2022-08-17
             Status|UNCONFIRMED                 |NEW

--- Comment #3 from Andrew Pinski  ---
Basically:
void
aarch64_print_patchable_function_entry (FILE *file,
                                        unsigned HOST_WIDE_INT patch_area_size,
                                        bool record_p)
{
  if (cfun->machine->label_is_assembled
      && aarch64_bti_enabled ()
      && !cgraph_node::get (cfun->decl)->only_called_directly_p ())


That last check just needs to be removed as there is no way to know if the
linker will output a veneer.

[Bug target/106671] aarch64: BTI instruction are not inserted for cross-section direct calls

2022-08-17 Thread scott at os dot amperecomputing.com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106671

--- Comment #2 from D Scott Phillips  ---
(In reply to Andrew Pinski from comment #1)
> Shouldn't the linker add the BTI inside the ___veneer instead?

The bti instruction has to be placed at the target of the indirect branch (at
the top of `func` in this case) so I don't think it would be possible to work
around this just within the veneer.
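
Why the landing pad has to sit at the branch target can be seen from a simplified model of the BTI compatibility check on guarded pages. This is an illustrative sketch: the enum and function names are hypothetical, PACIASP/PACIBSP acting as implicit landing pads is ignored, and the rules encoded are the commonly documented ones (BLR accepts BTI c/jc, BR via x16/x17 accepts BTI c/j/jc, BR via any other register accepts BTI j/jc).

```cpp
#include <cassert>

// How control arrived at an instruction (simplified).
enum class Branch { Direct, BlrAny, BrX16X17, BrOther };

// What the first instruction at the destination is (simplified).
enum class Landing { None, BtiC, BtiJ, BtiJC };

// True if landing on `l` after branch kind `b` raises no
// Branch Target Exception in this simplified model.
bool landing_ok(Branch b, Landing l) {
    switch (b) {
    case Branch::Direct:
        return true;  // direct B/BL targets are never checked
    case Branch::BlrAny:
        return l == Landing::BtiC || l == Landing::BtiJC;
    case Branch::BrX16X17:
        return l != Landing::None;
    case Branch::BrOther:
        return l == Landing::BtiJ || l == Landing::BtiJC;
    }
    return false;
}
```

A long-branch veneer typically ends in an indirect `br` through x16 or x17, so a `func` compiled without `bti c` passes the check when called directly but fails it once the call is routed through such a veneer.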

[Bug target/106671] aarch64: BTI instruction are not inserted for cross-section direct calls

2022-08-17 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106671

--- Comment #1 from Andrew Pinski  ---
Shouldn't the linker add the BTI inside the ___veneer instead?