Hi,

As mentioned in:
https://gcc.gnu.org/pipermail/gcc/2024-August/244581.html

AArch64 cl_optimization_stream_out streams out target-specific
optimization options such as flag_aarch64_early_ldp_fusion and
aarch64_early_ra. This breaks AArch64/nvptx offloading: nvptx
cl_optimization_stream_in has no corresponding stream-in for these
options, so it ends up setting invalid values for ptr->explicit_mask
(and subsequent data structures).

As a result, even a trivial test like the following ICEs in
lto_read_decls with -O3 -fopenmp -foffload=nvptx-none:

  int main()
  {
    int x;
    #pragma omp target map(x)
      x;
  }

The attached patch modifies optc-save-gen.awk to generate an
if (!lto_stream_offload_p) check before streaming out each
target-specific option in cl_optimization_stream_out, which fixes the
issue.

cl_optimization_stream_out after the patch (last few entries):

  bp_pack_var_len_int (bp, ptr->x_flag_wrapv_pointer);
  bp_pack_var_len_int (bp, ptr->x_debug_nonbind_markers_p);
  if (!lto_stream_offload_p)
  bp_pack_var_len_int (bp, ptr->x_flag_aarch64_early_ldp_fusion);
  if (!lto_stream_offload_p)
  bp_pack_var_len_int (bp, ptr->x_aarch64_early_ra);
  if (!lto_stream_offload_p)
  bp_pack_var_len_int (bp, ptr->x_flag_aarch64_late_ldp_fusion);
  if (!lto_stream_offload_p)
  bp_pack_var_len_int (bp, ptr->x_flag_mlow_precision_div);
  if (!lto_stream_offload_p)
  bp_pack_var_len_int (bp, ptr->x_flag_mrecip_low_precision_sqrt);
  if (!lto_stream_offload_p)
  bp_pack_var_len_int (bp, ptr->x_flag_mlow_precision_sqrt);
  for (size_t i = 0; i < ARRAY_SIZE (ptr->explicit_mask); i++)
    bp_pack_value (bp, ptr->explicit_mask[i], 64);

For target-specific options, streaming out is now gated on the
!lto_stream_offload_p check, so the host writes only the options that
the offload target also reads, and explicit_mask lands where the
reader expects it.

The patch also fixes failures caused by the same issue with
x86_64->nvptx offloading for target-print-1.f90 (and a couple more).

Does the patch look OK?

Signed-off-by: Prathamesh Kulkarni <prathame...@nvidia.com>

Thanks,
Prathamesh
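To illustrate why the extra fields corrupt explicit_mask on the reading side, here is a minimal toy model of the mismatch. It is only a sketch: toy_stream, toy_pack, toy_unpack, host_opts, accel_opts and round_trip_mask are hypothetical names invented for this example, and the stream is a plain array of words rather than GCC's real bitpack machinery. The offload_p parameter plays the role of lto_stream_offload_p.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for a bitpack stream: a flat array of 64-bit words.
   This is NOT GCC's bitpack API, just enough to show the ordering issue.  */
struct toy_stream {
  uint64_t words[8];
  size_t len;   /* number of words written so far */
  size_t pos;   /* number of words read so far */
};

static void toy_pack(struct toy_stream *s, uint64_t v) {
  assert(s->len < 8);
  s->words[s->len++] = v;
}

static uint64_t toy_unpack(struct toy_stream *s) {
  assert(s->pos < s->len);
  return s->words[s->pos++];
}

/* Host side (think AArch64): one generic and two target-specific options.  */
struct host_opts {
  int generic_opt;
  int target_opt_a;
  int target_opt_b;
  uint64_t explicit_mask;
};

/* Offload side (think nvptx): only knows the generic option.  */
struct accel_opts {
  int generic_opt;
  uint64_t explicit_mask;
};

/* Writer: target-specific options are skipped when offload_p is true,
   mirroring the check the patch generates.  */
static void host_stream_out(struct toy_stream *s, const struct host_opts *o,
                            bool offload_p) {
  toy_pack(s, (uint64_t) o->generic_opt);
  if (!offload_p)
    toy_pack(s, (uint64_t) o->target_opt_a);
  if (!offload_p)
    toy_pack(s, (uint64_t) o->target_opt_b);
  toy_pack(s, o->explicit_mask);
}

/* Reader: has no stream-in for the host's target-specific options,
   so it reads the mask right after the generic option.  */
static void accel_stream_in(struct toy_stream *s, struct accel_opts *o) {
  o->generic_opt = (int) toy_unpack(s);
  o->explicit_mask = toy_unpack(s);
}

/* Round-trip a fixed set of options and return the mask the reader sees.  */
static uint64_t round_trip_mask(bool offload_p) {
  struct toy_stream s = {0};
  struct host_opts h = { 3, 1, 2, 0xABCDu };
  struct accel_opts a;
  host_stream_out(&s, &h, offload_p);
  accel_stream_in(&s, &a);
  return a.explicit_mask;
}
```

In this model, round_trip_mask(false) returns the value of target_opt_a rather than the real mask, because the reader consumes the first unexpected word as explicit_mask. round_trip_mask(true) returns the mask that was written.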
[optc-save-gen.awk] Fix streaming of command line options for offloading.

Modify optc-save-gen.awk to generate an if (!lto_stream_offload_p)
check before streaming out each target-specific option in
cl_optimization_stream_out, so that these options are skipped when
offloading is enabled.

gcc/ChangeLog:
        * gcc/optc-save-gen.awk: New array var_target_opt. Use it to
        generate if (!lto_stream_offload_p) check in
        cl_optimization_stream_out.

Signed-off-by: Prathamesh Kulkarni <prathame...@nvidia.com>

diff --git a/gcc/optc-save-gen.awk b/gcc/optc-save-gen.awk
index a3af88e3776..228efe2accd 100644
--- a/gcc/optc-save-gen.awk
+++ b/gcc/optc-save-gen.awk
@@ -1307,6 +1307,11 @@ for (i = 0; i < n_opts; i++) {
                         var_opt_optimize_init[n_opt_val] = init;
                 }
+                # Mark options that are annotated with both Optimization and
+                # Target so we can avoid streaming out target-specific opts when
+                # offloading is enabled.
+                if (flag_set_p("Target", flags[i]))
+                        var_target_opt[n_opt_val] = 1;
                 n_opt_val++;
         }
 }
@@ -1384,6 +1389,10 @@ for (i = 0; i < n_opt_val; i++) {
         } else {
                 sgn = "int";
         }
+        # Do not stream out target-specific opts if offloading is
+        # enabled.
+        if (var_target_opt[i])
+                print " if (!lto_stream_offload_p)"
         # If applicable, encode the streamed value.
         if (var_opt_optimize_init[i]) {
                 print " if (" var_opt_optimize_init[i] " > (" var_opt_val_type[i] ") 10)";