How to improve the location of a gcc diagnostic

David Malcolm Thu, 23 Jun 2016 07:05:37 -0700

A user filed a bug about a bad location in a warning.  It was marked as
an "easyhack" in bugzilla, and I had a go at fixing it.


I though it may be useful for new GCC developers if I document what I
did to fix it.

FWIW, the bug was PR c/71610
  i.e. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71610)
("Improve location for "warning: ISO C restricts enumerator values to
range of 'int' [-Wpedantic]"?").


Step 1: create a minimal reproducer for the diagnostic

This was easy for this bug: I copied it from the bug report.  Try to
avoid #include if possible; start with the simplest possible reproducer
(to minimize effort in the debugger), and work from there.

If you're working on a bug in our bugzilla it's good to click on the
"take" button next to "Assignee" to mark yourself as assignee, and to
set the status to ASSIGNED, so that we don't duplicate effort.


Step 2: create a file under the relevant subdirectory of gcc/testsuite.

This will be under one of gcc.dg, g++.dg or c-c++-common. If possible,
create it under c-c++-common so that we can test both the C and C++
frontends.

I created a file called "pr71610.c" under c-c++-common.  (If this was
for a specific warning, the name of the warning may have been better,
but my case was part of "-Wpedantic", which is too broad for a testcase
title).


Step 3: run the reproducer under DejaGnu

With a freshly built gcc, I ran the following from the "gcc" build
directory:

  make -jN -k && make check-gcc RUNTESTFLAGS="-v -v dg.exp=pr71610*"

changing the "N" in -jN to reflect the number of cores on my box.

This ensures that the compiler is rebuilt (useful when making changes),
and then runs the C compiler testsuite, filtering to just the new tests
that I'd added.

Given that the test case is for c-c++-common, it's also worth running
"make check-g++" to run the C++ compiler on the same tests.

I added DejaGnu directives to the test file to ensure that the test was
run with appropriate command-line flags to trigger the behavior in
question.  In this case, -Wpedantic is needed, and we want to ensure
that the correct source ranges are printed so -fdiagnostics-show-caret
is also needed (otherwise the testsuite defaults to -fno-diagnostics
-show-caret).

Hence I added:
  /* { dg-options "-pedantic -fdiagnostics-show-caret" } */
to the test case.

Given that I wanted to test ranges, I renamed some of the identifiers
in the test case to avoid using 1-character names.  That way it's easy
to verify that all of the identifier is properly underlined (since
finish != the start for a multi-character identifier).

At this point you should have a reproducer that runs within our DejaGnu
-based test suite, and demonstrates the problem.

Adding "-v -v" to RUNTESTFLAGS gives us the command-line invocations of
the test cases.  Copy and paste this and verify that the reproducer can
be run directly from the command-line.  This brings us to...


Step 4: run the reproducer under gdb

Having created a reproducer and identified a command-line invocation
for running it under the testsuite, add the following to the end of the
command-line:

  -wrapper gdb,--args

This will make the gcc driver run "cc1" under gdb.

A good breakpoint location is diagnostic_show_locus.  This is run
whenever a diagnostic is actually emitted (as opposed to, say, warning
and warning_at, which are called many times as the compiler runs, with
most of the calls doing nothing).

  (gdb) break diagnostic_show_locus
  (gdb) run

Hopefully the debugger now hits the breakpoint.  At this point, go up
the stack until you reach the frontend call that emits the diagnostic
(in my case, this was a call to pedwarn).  It's usually helpful to now
put a breakpoint on the actual emission point (to save a step when re
running).


Step 5: figure out where the location is coming from, and improve it

In the debugger you can investigate which location is being used for
the diagnostic, and use a better one.

Many older calls to the diagnostic subsystem implicitly use the
"input_location" global, which is typically the location (and range) of
the token currently being parsed.

In my case, the code was using:

  value_loc = c_parser_peek_token (parser)->location;

which gets the location (and range) of the first token within an
expression.

Usually it's better to display the range of an entire expression rather
than just a token.

Some expression tree nodes have locations, others do not.  I hope to
fix this for gcc 7.  In the meantime, the C frontend has a "struct
c_expr" which captures the range of an expression during parsing (even
for those that don't have location information), and the C++ frontend
has a "class cp_expr" which works similarly, capturing the location and
range (again, only during parsing).  Or you can use:

  EXPR_LOC_OR_LOC (expr, fallback_location)

If you want to emit multiple locations from a test, or emit fix-it
hints, try using the rich_location class (see libcpp/include/line
map.h).


Step 6: add more test cases

Assuming you've made a change to update the location that's used for a
diagnostic, think about what could break, and try to extend the test
case beyond the minimal reproducer to cover other cases.  It's good to
think from two perspectives: developer ("make it work") and QA ("try to
break it").

If you're using -fdiagnostics-show-caret to capture the range
information, you can use dg-{begin|end}-multiline-output to express
testing for this output.  For example:

/* { dg-options "-fdiagnostics-show-caret" } */
void test (void)
{
  enum { c1 = "not int", /* { dg-error "14: enumerator value for .c1.
is not an integer constant" } */
/* { dg-begin-multiline-output "" }
   enum { c1 = "not int",
               ^~~~~~~~~
   { dg-end-multiline-output "" } */

        c_end };
}

It's preferable to consolidate multiple related tests into one source
file, where practical, to minimize the per-source-file overhead when
running the test suite.


Step 7: bootstrap&regression test

Assuming you have a patch that works, follow the usual GCC testing
steps:
  https://gcc.gnu.org/contribute.html
to ensure it doesn't break anything.

FWIW I have a script for building both unpatched ("control") and
patched ("experiment") versions of gcc, bootstrapping and running the
regression-test suite, and comparing the results:
  https://github.com/davidmalcolm/gcc-build

Write a ChangeLog.  I do this whilst it's bootstrapping, using a script
here:
 https://github.com/davidmalcolm/gcc-refactoring-scripts/blob/master/ge
nerate-changelog.py
which makes it easier to generate a ChangeLog (by parsing the output of
"git show").


Step 8: post to gcc-patches

FWIW, my patch for the issue described above is here:
  https://gcc.gnu.org/ml/gcc-patches/2016-06/msg01658.html


Step 9: respond to patch review


Step 10: commit to svn

Once the patch is approved, commit it to svn.  (FWIW I do all of my
development work in git; I have an svn checkout that I use purely for
the final checkin, having smoketested the patch first).


Hope this is helpful
Dave

How to improve the location of a gcc diagnostic

Reply via email to