> -----Original Message-----
> From: Richard Biener <rguent...@suse.de>
> Sent: Tuesday, August 13, 2024 12:52 PM
> To: Andrew Pinski <pins...@gmail.com>
> Cc: Prathamesh Kulkarni <prathame...@nvidia.com>; gcc-
> patc...@gcc.gnu.org; Thomas Schwinge <tschwi...@baylibre.com>
> Subject: Re: [optc-save-gen.awk] Fix streaming of command line options
> for offloading
> 
> External email: Use caution opening links or attachments
> 
> 
> > Am 13.08.2024 um 08:37 schrieb Andrew Pinski <pins...@gmail.com>:
> >
> > On Mon, Aug 12, 2024 at 10:36 PM Prathamesh Kulkarni
> > <prathame...@nvidia.com> wrote:
> >>
> >> Hi,
> >> As mentioned in:
> >> https://gcc.gnu.org/pipermail/gcc/2024-August/244581.html
> >>
> >> AArch64 cl_optimization_stream_out streams out target-specific
> >> optimization options like flag_aarch64_early_ldp_fusion,
> aarch64_early_ra etc, which breaks AArch64/nvptx offloading, since
> nvptx cl_optimization_stream_in doesn't have corresponding stream-in
> for these options and ends up setting invalid values for ptr-
> >explicit_mask (and subsequent data structures).
> >>
> >> This makes even a trivial test like the following to cause ICE in
> lto_read_decls with -O3 -fopenmp -foffload=nvptx-none:
> >>
> >> int main()
> >> {
> >>  int x;
> >>  #pragma omp target map(x)
> >>    x;
> >> }
> >>
> >> The attached patch modifies optc-save-gen.awk to generate if
> >> (!lto_stream_offload_p) check before streaming out target-specific
> opt in cl_optimization_stream_out, which fixes the issue.
> cl_optimization_stream_out after patch (last few entries):
> >>
> >>  bp_pack_var_len_int (bp, ptr->x_flag_wrapv_pointer);
> >> bp_pack_var_len_int (bp, ptr->x_debug_nonbind_markers_p);  if
> >> (!lto_stream_offload_p)  bp_pack_var_len_int (bp,
> >> ptr->x_flag_aarch64_early_ldp_fusion);
> >>  if (!lto_stream_offload_p)
> >>  bp_pack_var_len_int (bp, ptr->x_aarch64_early_ra);  if
> >> (!lto_stream_offload_p)  bp_pack_var_len_int (bp,
> >> ptr->x_flag_aarch64_late_ldp_fusion);
> >>  if (!lto_stream_offload_p)
> >>  bp_pack_var_len_int (bp, ptr->x_flag_mlow_precision_div);  if
> >> (!lto_stream_offload_p)  bp_pack_var_len_int (bp,
> >> ptr->x_flag_mrecip_low_precision_sqrt);
> >>  if (!lto_stream_offload_p)
> >>  bp_pack_var_len_int (bp, ptr->x_flag_mlow_precision_sqrt);  for
> >> (size_t i = 0; i < ARRAY_SIZE (ptr->explicit_mask); i++)
> >>    bp_pack_value (bp, ptr->explicit_mask[i], 64);
> >>
> >> For target-specific options, streaming out is gated on
> !lto_stream_offload_p check.
> >>
> >> The patch also fixes failures due to same issue with x86_64->nvptx
> offloading for target-print-1.f90 (and couple more).
> >> Does the patch look OK ?
> >
> > I think it seems to be on the right track. One thing that is also
> > going to be an issue is streaming in, there could be a target option
> > on the offload side that is marked as Optimization that would might
> > also cause issues. We should check to make sure that also gets fixed
> > here too. Or error out for offloading targets can't have target
> > options with Optimization on them during the build.
Thanks for the suggestions. The attached patch modifies optc-save-gen.awk
to emit an error if accel backend marks target specific option with 
Optimization.
AFAIU, currently neither nvptx nor gcn have target-specific options marked with 
Optimization,
so this is mostly a safeguard against future additions.

cl_optimization_stream_in after patch for target-specifc optimization options:

#ifdef ACCEL_COMPILER
#error accel compiler cannot define Optimization attribute for target-specific 
option x_flag_aarch64_early_ldp_fusion
#else
  ptr->x_flag_aarch64_early_ldp_fusion = (signed char ) bp_unpack_var_len_int 
(bp);
#endif

To test if this works, I added -mfoo to nvptx.opt and marked it with both 
Target and Optimization,
which resulted in the following build error for nvptx:

options-save.cc:13548:2: error: #error accel compiler cannot define 
Optimization attribute for target-specifc option x_flag_nvptx_foo
13548 | #error accel compiler cannot define Optimization attribute for 
target-specific option x_flag_nvptx_foo
      |  ^~~~~
> 
> It may have been misguided to mark target specific flags as
> Optimization.  It might be required to merge those (from all targets)
> into the common optimize enum, like we do for tree codes.  Language
> specific options marked as Optimization possibly have the same issue
> when mixing with other languages and LTO.  Can you assess the
> situation a bit more?
AFAIK, only c-family/c.opt marks few options with Optimization flag. I tried 
marking
fortran's -ffrontend-optimize with Optimization and verified that Optimization 
options
are combined for c-family languages and fortran in 
cl_optimization_stream_{out,in}.

cl_optimization_stream_out shows:
  ...
  bp_pack_var_len_int (bp, ptr->x_flag_frontend_optimize);
  ...
  bp_pack_var_len_int (bp, ptr->x_flag_nothrow_opt);

and likewise has corresponding entries for cl_optimization_stream_in.

So I guess this shouldn't be an issue with lang specific Optimization opts ?

Signed-off-by: Prathamesh Kulkarni <prathame...@nvidia.com>

Thanks,
Prathamesh
> 
> I think the proposed fix looks reasonable but the problem might be
> more widespread and warrant a more global solution or at least
> revisiting documentation?
> 
> Thanks,
> Richard
> 
> > Thanks,
> > Andrew Pinski
> >
> >>
> >> Signed-off-by: Prathamesh Kulkarni <prathame...@nvidia.com>
> >>
> >> Thanks,
> >> Prathamesh
[optc-save-gen.awk] Fix streaming of command line options for offloading.

The patch modifies optc-save-gen.awk to generate if (!lto_stream_offload_p)
check before streaming out target-specific opt in cl_optimization_stream_out,
when offloading is enabled.

gcc/ChangeLog:
        * gcc/optc-save-gen.awk: New array var_target_opt. Use it to generate
        if (!lto_stream_offload_p) check in cl_optimization_stream_out,
        and generate a diagnostic with #error if accelerator backend uses
        Optimization for target-specifc options in cl_optimization_stream_in.

Signed-off-by: Prathamesh Kulkarni <prathame...@nvidia.com>

diff --git a/gcc/optc-save-gen.awk b/gcc/optc-save-gen.awk
index a3af88e3776..b1289c281e7 100644
--- a/gcc/optc-save-gen.awk
+++ b/gcc/optc-save-gen.awk
@@ -1307,6 +1307,11 @@ for (i = 0; i < n_opts; i++) {
                        var_opt_optimize_init[n_opt_val] = init;
                }
 
+               # Mark options that are annotated with both Optimization and
+               # Target so we can avoid streaming out target-specific opts when
+               # offloading is enabled.
+               if (flag_set_p("Target", flags[i]))
+                       var_target_opt[n_opt_val] = 1;
                n_opt_val++;
        }
 }
@@ -1384,6 +1389,10 @@ for (i = 0; i < n_opt_val; i++) {
                } else {
                        sgn = "int";
                }
+               # Do not stream out target-specific opts if offloading is
+               # enabled.
+               if (var_target_opt[i])
+                       print "  if (!lto_stream_offload_p)"
                # If applicable, encode the streamed value.
                if (var_opt_optimize_init[i]) {
                        print "  if (" var_opt_optimize_init[i] " > (" 
var_opt_val_type[i] ") 10)";
@@ -1408,6 +1417,11 @@ print "                           struct cl_optimization 
*ptr ATTRIBUTE_UNUSED)"
 print "{";
 for (i = 0; i < n_opt_val; i++) {
        name = var_opt_val[i]
+        if (var_target_opt[i]) {
+               print "#ifdef ACCEL_COMPILER"
+               print "#error accel compiler cannot define Optimization 
attribute for target-specific option " name;
+               print "#else"
+       }
        otype = var_opt_val_type[i];
        if (otype ~ "^const char \\**$") {
                print "  ptr->" name" = bp_unpack_string (data_in, bp);";
@@ -1427,6 +1441,8 @@ for (i = 0; i < n_opt_val; i++) {
                        print "    ptr->" name" ^= " var_opt_optimize_init[i] 
";";
                }
        }
+       if (var_target_opt[i])
+               print "#endif"
 }
 print "  for (size_t i = 0; i < ARRAY_SIZE (ptr->explicit_mask); i++)";
 print "    ptr->explicit_mask[i] = bp_unpack_value (bp, 64);";

Reply via email to