Re: [PATCH] rs6000: Fix up PCH in --enable-host-pie builds [PR115324]

2024-06-03 Thread Segher Boessenkool
Hi!

On Mon, Jun 03, 2024 at 04:55:05PM +0200, Jakub Jelinek wrote:
> PCH doesn't work properly in --enable-host-pie configurations on
> powerpc*-linux*.

PCH and PIE, two of my favourites ;-)

> For PCH though it actually results in saving those huge arrays (one is
> 130832 bytes, another 81568 bytes) into the .gch files and loading them back
> in full.  While the bifname and attr_string and next pointers are marked as
> GTY((skip)), they are actually saved to point to the .rodata and .data
> sections of the process which writes the PCH, but because cc1/cc1plus etc.
> are position independent executables with --enable-host-pie, when it is
> loaded from the PCH file, it can point in a completely different addresses
> where nothing is mapped at all or some random different thing appears at.

Yuck.

> So, either we'd need to add some further GTY extensions, or the following
> patch instead reworks it such that the fntype members which were the only
> reason for PCH in those arrays are moved to separate arrays.

And that just sidesteps the limitation in PCH?

>  void
>  rs6000_init_generated_builtins ()
>  {
> +  bifdata *rs6000_builtin_info_p;
> +  tree *rs6000_builtin_info_fntype_p;
> +  ovlddata *rs6000_instance_info_p;
> +  tree *rs6000_instance_info_fntype_p;
> +  ovldrecord *rs6000_overload_info_p;
> +  __asm ("" : "=r" (rs6000_builtin_info_p) : "0" (rs6000_builtin_info));

Bah.

It should not be called _p of course, it is not a predicate.  And
relying on the operand tie to not have to do a much more obvious
assignment, please don't.  Just *do* write assignments, and then use
a simple "+r"?

But you call this a hack anyway, you wouldn't propose to actually
include this patch :-)

> Bootstrapped/regtested on powerpc64le-linux and powerpc64-linux (-m32/-m64
> testing there), ok for trunk and after a while for release branches?

Yes please.  What nastiness.  Thanks for dealing with it!


Segher


> 2024-06-03  Jakub Jelinek  
> 
>   PR target/115324
>   * config/rs6000/rs6000-gen-builtins.cc (write_decls): Remove
>   GTY markup from struct bifdata and struct ovlddata and remove their
>   fntype members.  Change next member in struct ovlddata and
>   first_instance member of struct ovldrecord to have int type rather
>   than struct ovlddata *.  Remove GTY markup from rs6000_builtin_info
>   and rs6000_instance_info arrays, declare new
>   rs6000_builtin_info_fntype and rs6000_instance_info_fntype arrays,
>   which have GTY markup.
>   (write_bif_static_init): Adjust for the above changes.
>   (write_ovld_static_init): Likewise.
>   (write_init_bif_table): Likewise.
>   (write_init_ovld_table): Likewise.
>   * config/rs6000/rs6000-builtin.cc (rs6000_init_builtins): Likewise.
>   * config/rs6000/rs6000-c.cc (find_instance): Likewise.  Make static.
>   (altivec_resolve_overloaded_builtin): Adjust for the above changes.


[PATCH] rs6000: Fix up PCH in --enable-host-pie builds [PR115324]

2024-06-03 Thread Jakub Jelinek
Hi!

PCH doesn't work properly in --enable-host-pie configurations on
powerpc*-linux*.
The problem is that the rs6000_builtin_info and rs6000_instance_info
arrays mix pointers to .rodata/.data (bifname and attr_string point
to string literals in .rodata section, and the next member is either NULL
or &rs6000_instance_info[XXX]) and GC member (tree fntype).
Now, for normal GC this works just fine, we emit
  {
&rs6000_instance_info[0].fntype,
1 * (RS6000_INST_MAX),
sizeof (rs6000_instance_info[0]),
>_ggc_mx_tree_node,
>_pch_nx_tree_node
  },
  {
&rs6000_builtin_info[0].fntype,
1 * (RS6000_BIF_MAX),
sizeof (rs6000_builtin_info[0]),
>_ggc_mx_tree_node,
>_pch_nx_tree_node
  },
GC roots which are strided and thus cover only the fntype members of all
the elements of the two arrays.
For PCH though it actually results in saving those huge arrays (one is
130832 bytes, another 81568 bytes) into the .gch files and loading them back
in full.  While the bifname and attr_string and next pointers are marked as
GTY((skip)), they are actually saved to point to the .rodata and .data
sections of the process which writes the PCH, but because cc1/cc1plus etc.
are position independent executables with --enable-host-pie, when it is
loaded from the PCH file, it can point in a completely different addresses
where nothing is mapped at all or some random different thing appears at.
While gengtype supports the callback option, that one is meant for
relocatable function pointers and doesn't work in the case of GTY arrays
inside of .data section anyway.

So, either we'd need to add some further GTY extensions, or the following
patch instead reworks it such that the fntype members which were the only
reason for PCH in those arrays are moved to separate arrays.

Size-wise in .data sections it is (in bytes):

 vanillapatched
rs6000_builtin_info  130832 110704
rs6000_instance_info  81568  40784
rs6000_overload_info   7392   7392
rs6000_builtin_info_fntype0  10064
rs6000_instance_info_fntype   0  20392
sum  219792 189336

where previously we saved/restored for PCH those 130832+81568 bytes, now we
save/restore just 10064+20392 bytes, so this change is beneficial for the
data section size.

Unfortunately, it grows the size of the rs6000_init_generated_builtins
function, vanilla had 218328 bytes, patched has 228668.

When I applied
 void
 rs6000_init_generated_builtins ()
 {
+  bifdata *rs6000_builtin_info_p;
+  tree *rs6000_builtin_info_fntype_p;
+  ovlddata *rs6000_instance_info_p;
+  tree *rs6000_instance_info_fntype_p;
+  ovldrecord *rs6000_overload_info_p;
+  __asm ("" : "=r" (rs6000_builtin_info_p) : "0" (rs6000_builtin_info));
+  __asm ("" : "=r" (rs6000_builtin_info_fntype_p) : "0" 
(rs6000_builtin_info_fntype));
+  __asm ("" : "=r" (rs6000_instance_info_p) : "0" (rs6000_instance_info));
+  __asm ("" : "=r" (rs6000_instance_info_fntype_p) : "0" 
(rs6000_instance_info_fntype));
+  __asm ("" : "=r" (rs6000_overload_info_p) : "0" (rs6000_overload_info));
+  #define rs6000_builtin_info rs6000_builtin_info_p
+  #define rs6000_builtin_info_fntype rs6000_builtin_info_fntype_p
+  #define rs6000_instance_info rs6000_instance_info_p
+  #define rs6000_instance_info_fntype rs6000_instance_info_fntype_p
+  #define rs6000_overload_info rs6000_overload_info_p
+
hack by hand, the size of the function is 209700 though, so if really
wanted, we could add __attribute__((__noipa__)) to the function when
building with recent enough GCC and pass pointers to the first elements
of the 5 arrays to the function as arguments.  If you want such a change,
could that be done incrementally?

Bootstrapped/regtested on powerpc64le-linux and powerpc64-linux (-m32/-m64
testing there), ok for trunk and after a while for release branches?

2024-06-03  Jakub Jelinek  

PR target/115324
* config/rs6000/rs6000-gen-builtins.cc (write_decls): Remove
GTY markup from struct bifdata and struct ovlddata and remove their
fntype members.  Change next member in struct ovlddata and
first_instance member of struct ovldrecord to have int type rather
than struct ovlddata *.  Remove GTY markup from rs6000_builtin_info
and rs6000_instance_info arrays, declare new
rs6000_builtin_info_fntype and rs6000_instance_info_fntype arrays,
which have GTY markup.
(write_bif_static_init): Adjust for the above changes.
(write_ovld_static_init): Likewise.
(write_init_bif_table): Likewise.
(write_init_ovld_table): Likewise.
* config/rs6000/rs6000-builtin.cc (rs6000_init_builtins): Likewise.
* config/rs6000/rs6000-c.cc (find_instance): Likewise.  Make static.
(altivec_resolve_overloaded_builtin): Adjust for the above changes.

--- gcc/config/rs6000/rs6000-gen-builtins.cc.jj 2024-05-31 22:09:39.300059155 
+0200
+++ gcc/config/rs600