Hi all,

The TODO at the top of gfc_cpp_post_options in gcc/fortran/cpp.cc has
been sitting there for many years:

    /* TODO: allow non-traditional modes, e.g. by -cpp-std=...?  */
    cpp_option->traditional = 1;

This patch finally lifts that restriction.  A new -fno-traditional-cpp
option (off by default, so existing users see no change) selects the
full C preprocessor instead.

To make this safe in the presence of Fortran source, libcpp gains two
new c_lang values (CLK_GNUF77 for fixed-form and CLK_GNUF90 for
free-form Fortran), and the lexer learns three small Fortran rules:

  - "//" is emitted as two CPP_DIV tokens, since in Fortran it is the
    string-concatenation operator and not a C++ line comment,
  - '!' outside of preprocessing directives starts a Fortran line
    comment, so that an unbalanced apostrophe ("John's variable" and
    friends) does not run away as a C character literal,
  - in CLK_GNUF77 only, a column-1 'C', 'c', '*', 'D' or 'd' is
    additionally treated as a comment-line starter.
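
As a rough illustration of the three rules above, here is a toy
classifier.  It is not the code in the patch: the names Form, Tok and
scan_line are hypothetical, it looks at one non-directive line at a
time, and it ignores string literals and fixed-form continuation
columns, which the real lexer has to handle.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-ins for the two new c_lang values (not the libcpp enum).
enum class Form { Fixed, Free };

// Toy token kinds; only Div corresponds to a real libcpp name (CPP_DIV).
enum class Tok { Div, Comment };

// Report the '/' and comment events seen on one non-directive source line.
std::vector<Tok> scan_line(const std::string &line, Form form)
{
  assert(line.find('\n') == std::string::npos);  // one line at a time

  std::vector<Tok> out;

  // Rule 3: in fixed form only, a marker in column 1 comments out the line.
  if (form == Form::Fixed && !line.empty()
      && std::string("Cc*Dd").find(line[0]) != std::string::npos)
    {
      out.push_back(Tok::Comment);
      return out;
    }

  for (std::size_t i = 0; i < line.size(); ++i)
    {
      if (line[i] == '!')
        {
          // Rule 2: '!' starts a comment that runs to the end of the line,
          // so a later apostrophe is never lexed as a character literal.
          out.push_back(Tok::Comment);
          break;
        }
      if (line[i] == '/')
        // Rule 1: every '/' is its own division token, so "//" is two of them.
        out.push_back(Tok::Div);
    }
  return out;
}
```

With this sketch, "s = a // b" in free form yields two Div tokens, and
"call foo(a / b)" is an ordinary statement in free form but a comment
line in fixed form because of the column-1 'c'.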

The driver also forces -C / -CC in non-traditional mode so that the
comments survive into the .i output.  That is not just a cosmetic
choice.  The OpenMP, OpenACC and GCC pragma sentinels ("!$OMP",
"!$ACC", "!GCC$") are lexically Fortran comments but carry directive
semantics that gfortran's scanner must still see.  Dropping them would
silently disable every parallel or offload directive in the program.
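
To make the point concrete, the following sketch shows the kind of
check a later pass could apply to a preserved comment.  The helper
is_directive_sentinel is hypothetical and is not how gfortran's scanner
is written.  It only covers the '!' spellings quoted above and assumes
the match is case-insensitive.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: does this comment text begin with one of the
// sentinels named above?  Not part of gfortran or libcpp.
bool is_directive_sentinel(const std::string &comment)
{
  static const char *const sentinels[] = { "!$OMP", "!$ACC", "!GCC$" };

  for (const char *s : sentinels)
    {
      const std::string prefix(s);
      if (comment.size() < prefix.size())
        continue;

      // Compare case-insensitively, one character at a time.
      bool match = true;
      for (std::size_t i = 0; i < prefix.size(); ++i)
        if (std::toupper(static_cast<unsigned char>(comment[i])) != prefix[i])
          {
            match = false;
            break;
          }
      if (match)
        return true;
    }
  return false;
}
```

If the preprocessor stripped comments, a line such as
"!$OMP PARALLEL DO" would never reach a check like this one.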

The Fortran case '!': body in libcpp/lex.cc is laid out as a near
copy-paste of the existing case '/': body just above.  The
comment-emission tail (fallthrough_comment_p, pfile->cb.comment,
save_comments check, save_comment) is byte-identical between the two.
The only deviations are (a) the dispatch is on language and state
instead of on a look-ahead character (Fortran has no block comments),
and (b) one fortran_line_comment: label sits between the operator
early-break and the comment-handling code, used by the fixed-form
column-1 marker check that lives just before the main switch.
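
The control flow described above can be sketched as follows.  This is
a toy model only: apart from the fortran_line_comment label, every name
(lex_one, its parameters, the returned strings) is hypothetical, and
the real comment tail does considerably more than set a string.

```cpp
#include <cassert>
#include <string>

// Toy model of the layout: a pre-switch check that jumps to a label
// placed inside the case '!': body, after the operator early-break.
std::string lex_one(char c, bool fixed_form_col1_marker, bool treat_as_operator)
{
  // Declared before the goto, so the jump bypasses no initialisation.
  std::string kind;

  // Pre-switch check: a fixed-form column-1 marker goes straight to the
  // shared comment-handling code.
  if (fixed_form_col1_marker)
    goto fortran_line_comment;

  switch (c)
    {
    case '!':
      if (treat_as_operator)
        {
          // Operator early-break, e.g. inside a preprocessing directive.
          kind = "operator";
          break;
        }
    fortran_line_comment:
      // Shared tail, reached by fall-through or by the goto above.
      kind = "comment";
      break;

    default:
      kind = "other";
      break;
    }

  assert(!kind.empty());  // every path above assigns kind
  return kind;
}
```

The label sits at statement level in the case body, so the jump is
well-formed C++; the alternative would be to duplicate the tail at the
pre-switch check.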

Bootstrapped and regression-tested on x86_64-pc-linux-gnu (Debian
trixie).  The full check completed with no new failures relative to
baseline.  The nine pre-existing FAILs in gcc.sum (six in
gcc.dg/crc-builtin-rev-target{32,64}.c and three in
gcc.target/i386/{pr115102,xchg-4}.c) live in code paths that the
libcpp changes here provably cannot reach.  The new lexer paths are
gated on lang being CLK_GNUF77 or CLK_GNUF90, and the one always-executed
change (save_comment now writing buffer[0] = from[-1]) yields the
same '/' as before for every pre-existing caller.

==== Points I would specifically appreciate review on ====

1. The goto fortran_line_comment in the pre-switch column-1 check
   lands inside case '!': below.  It enters at a top-level statement
   in the case body (between the operator early break and the
   comment_start = buffer->cur; assignment), not into an if-block,
   and no variable initialisations are bypassed.  It is still a goto
   across a case label, though, so an extra pair of eyes is welcome.
   If you would rather have the column-1 path call skip_line_comment
   inline and duplicate the ~12-line tail, I am happy to restructure.

2. The set of column-1 fixed-form comment markers in libcpp/lex.cc
   ('C', 'c', '*', 'D', 'd') intentionally includes the non-standard
   'D' / 'd' debug-line marker.  Standard Fortran 77 only specifies
   'C', 'c' and '*', but gfortran has long accepted 'D' / 'd' as
   comments by default (turned into code via -fd-lines-as-code).  I
   followed gfortran's existing semantics rather than the bare
   standard.  Please say so if you would prefer the libcpp side to be
   stricter and to leave the D-line decision to f951.

3. The identifiers CLK_GNUF77 for fixed-form and CLK_GNUF90 for
   free-form Fortran were chosen because they align nicely with the
   other identifiers, but fixed-form is not exclusive to Fortran 77 and
   free-form is not exclusive to Fortran 90.  Better suggestions are
   welcome.

==== Administrative notes ====

I have not (yet) assigned copyright to the FSF, so this patch is
offered under the Developer Certificate of Origin via the
Signed-off-by trailer in the commit message.  I will start the
assignment process if a maintainer indicates that the patch is even
copyrightable at all.  Right now, I have mostly duplicated code from
the C/C++ line comment code path and added documentation and tests.

For off-list discussion I am reachable on the #gfortran channel on
OFTC as hmenke (bridged from Matrix #_oftc_#gfortran:matrix.org).
Email is fine too, of course.

Kind regards,
Henri

Henri Menke (1):
  fortran: allow non-traditional C preprocessing via
    -fno-traditional-cpp

 gcc/fortran/cpp.cc                            | 31 ++++++++--
 gcc/fortran/invoke.texi                       | 43 +++++++++++++-
 gcc/fortran/lang.opt                          |  4 ++
 .../gfortran.dg/cpp_non_traditional_1.F90     | 37 ++++++++++++
 .../gfortran.dg/cpp_non_traditional_2.F90     | 23 ++++++++
 .../gfortran.dg/cpp_non_traditional_3.F       | 32 ++++++++++
 .../gfortran.dg/cpp_non_traditional_4.f90     | 13 ++++
 .../gfortran.dg/cpp_non_traditional_5.F90     | 18 ++++++
 libcpp/include/cpplib.h                       |  3 +-
 libcpp/init.cc                                |  4 +-
 libcpp/lex.cc                                 | 59 ++++++++++++++++++-
 11 files changed, 255 insertions(+), 12 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/cpp_non_traditional_1.F90
 create mode 100644 gcc/testsuite/gfortran.dg/cpp_non_traditional_2.F90
 create mode 100644 gcc/testsuite/gfortran.dg/cpp_non_traditional_3.F
 create mode 100644 gcc/testsuite/gfortran.dg/cpp_non_traditional_4.f90
 create mode 100644 gcc/testsuite/gfortran.dg/cpp_non_traditional_5.F90

-- 
2.54.0
