Hi,
As mentioned in:
https://gcc.gnu.org/pipermail/gcc/2024-August/244581.html

On AArch64, cl_optimization_stream_out streams out target-specific optimization
options such as flag_aarch64_early_ldp_fusion and aarch64_early_ra. This breaks
AArch64/nvptx offloading: the nvptx cl_optimization_stream_in has no
corresponding stream-in for these options, so it ends up setting invalid values
for ptr->explicit_mask (and for the data structures streamed in after it).

As a result, even a trivial test like the following causes an ICE in
lto_read_decls with -O3 -fopenmp -foffload=nvptx-none:

int main()
{
  int x;
  #pragma omp target map(x)
    x;
}

The attached patch modifies optc-save-gen.awk to generate an
if (!lto_stream_offload_p) check before streaming out each target-specific
option in cl_optimization_stream_out, which fixes the issue.
cl_optimization_stream_out after the patch (last few entries):

  bp_pack_var_len_int (bp, ptr->x_flag_wrapv_pointer);
  bp_pack_var_len_int (bp, ptr->x_debug_nonbind_markers_p);
  if (!lto_stream_offload_p)
  bp_pack_var_len_int (bp, ptr->x_flag_aarch64_early_ldp_fusion);
  if (!lto_stream_offload_p)
  bp_pack_var_len_int (bp, ptr->x_aarch64_early_ra);
  if (!lto_stream_offload_p)
  bp_pack_var_len_int (bp, ptr->x_flag_aarch64_late_ldp_fusion);
  if (!lto_stream_offload_p)
  bp_pack_var_len_int (bp, ptr->x_flag_mlow_precision_div);
  if (!lto_stream_offload_p)
  bp_pack_var_len_int (bp, ptr->x_flag_mrecip_low_precision_sqrt);
  if (!lto_stream_offload_p)
  bp_pack_var_len_int (bp, ptr->x_flag_mlow_precision_sqrt);
  for (size_t i = 0; i < ARRAY_SIZE (ptr->explicit_mask); i++)
    bp_pack_value (bp, ptr->explicit_mask[i], 64);

For target-specific options, streaming out is gated on the
!lto_stream_offload_p check.
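
To illustrate the mismatch outside of GCC, here is a minimal standalone
sketch. It is not GCC code: the stream, the option structs and every function
name in it are made up for illustration, and it only models the idea that the
host writes extra fields which the accelerator-side reader does not know about.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; these are not GCC's real types or bitpack API.  */
enum { N_COMMON = 2, N_TARGET = 3, STREAM_MAX = 8 };

struct stream { int64_t data[STREAM_MAX]; size_t len, pos; };

struct host_opts {
  int64_t common[N_COMMON];   /* Options every compiler in the chain knows.  */
  int64_t target[N_TARGET];   /* Host-only options (e.g. AArch64-specific).  */
  int64_t explicit_mask;
};

struct accel_opts {
  int64_t common[N_COMMON];   /* The accelerator has no host-only options.  */
  int64_t explicit_mask;
};

static void pack (struct stream *s, int64_t v)
{
  assert (s->len < STREAM_MAX);
  s->data[s->len++] = v;
}

static int64_t unpack (struct stream *s)
{
  assert (s->pos < s->len);
  return s->data[s->pos++];
}

/* Host-side writer.  SKIP_TARGET models lto_stream_offload_p with the
   patch applied: when true, host-only options are not written.  */
static void
host_stream_out (struct stream *s, const struct host_opts *o, bool skip_target)
{
  for (size_t i = 0; i < N_COMMON; i++)
    pack (s, o->common[i]);
  for (size_t i = 0; i < N_TARGET; i++)
    if (!skip_target)
      pack (s, o->target[i]);
  pack (s, o->explicit_mask);
}

/* Accelerator-side reader: reads only the fields it knows about.  */
static void
accel_stream_in (struct stream *s, struct accel_opts *o)
{
  for (size_t i = 0; i < N_COMMON; i++)
    o->common[i] = unpack (s);
  o->explicit_mask = unpack (s);
}

/* Round-trip one set of options and return the mask the reader sees.  */
int64_t
roundtrip_mask (bool skip_target)
{
  struct host_opts h = { { 1, 0 }, { 7, 8, 9 }, 0x5a };
  struct stream s = { { 0 }, 0, 0 };
  struct accel_opts a;

  host_stream_out (&s, &h, skip_target);
  accel_stream_in (&s, &a);
  return a.explicit_mask;
}
```

In this sketch roundtrip_mask (false) returns 7, i.e. the first host-only
value is misread as the mask, while roundtrip_mask (true) returns the intended
0x5a because writer and reader agree on the layout again.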

The patch also fixes failures caused by the same issue with x86_64->nvptx
offloading for target-print-1.f90 (and a couple more).
Does the patch look OK?

Signed-off-by: Prathamesh Kulkarni <prathame...@nvidia.com>

Thanks,
Prathamesh

[optc-save-gen.awk] Fix streaming of command line options for offloading.

The patch modifies optc-save-gen.awk to generate an if (!lto_stream_offload_p)
check before streaming out target-specific options in
cl_optimization_stream_out, so they are not streamed out when offloading is
enabled.

gcc/ChangeLog:
        * optc-save-gen.awk: New array var_target_opt. Use it to generate
        an if (!lto_stream_offload_p) check in cl_optimization_stream_out.

Signed-off-by: Prathamesh Kulkarni <prathame...@nvidia.com>

diff --git a/gcc/optc-save-gen.awk b/gcc/optc-save-gen.awk
index a3af88e3776..228efe2accd 100644
--- a/gcc/optc-save-gen.awk
+++ b/gcc/optc-save-gen.awk
@@ -1307,6 +1307,11 @@ for (i = 0; i < n_opts; i++) {
                        var_opt_optimize_init[n_opt_val] = init;
                }
 
+               # Mark options that are annotated with both Optimization and
+               # Target so we can avoid streaming out target-specific opts when
+               # offloading is enabled.
+               if (flag_set_p("Target", flags[i]))
+                       var_target_opt[n_opt_val] = 1;
                n_opt_val++;
        }
 }
@@ -1384,6 +1389,10 @@ for (i = 0; i < n_opt_val; i++) {
                } else {
                        sgn = "int";
                }
+               # Do not stream out target-specific opts if offloading is
+               # enabled.
+               if (var_target_opt[i])
+                       print "  if (!lto_stream_offload_p)"
                # If applicable, encode the streamed value.
                if (var_opt_optimize_init[i]) {
                        print "  if (" var_opt_optimize_init[i] " > (" var_opt_val_type[i] ") 10)";
