C++: show pertinent open token when missing a close token

David Malcolm Tue, 11 Jul 2017 11:32:51 -0700

On Tue, 2017-07-11 at 11:28 -0600, Martin Sebor wrote:
> On 07/11/2017 09:24 AM, David Malcolm wrote:
> > [This patch kit is effectively just one patch; I've split it up
> > into
> >  3 parts, in the hope of making it easier to review:
> >  the c-family parts, the C parts, and the C++ parts]
> > 
> > This patch adds a hint to the user to various errors generated
> > in the C frontend by:
> > 
> >   c_parser_require (parser, CPP_CLOSE_PAREN, "expected %<)%>")
> >   c_parser_skip_until_found (parser, CPP_CLOSE_BRACE, "expected
> > %<}%>")
> > 
> > etc (where there's a non-NULL msgid), and in the C++ frontend by:
> > 
> >   cp_parser_require (parser, CPP_CLOSE_PAREN, RT_CLOSE_PAREN)
> >   cp_parser_require (parser, CPP_CLOSE_BRACE, RT_CLOSE_BRACE)
> > 
> > The hint shows the user where the pertinent open paren or open
> > brace
> > is, which ought to be very helpful for complicated nested
> > collections
> > of parentheses, and somewhat helpful even for simple cases;
> 
> I've played with the patch a bit.  First, let me say that I like
> how it associates the curlies.  I agree that it will be helpful.
> There are other cases where highlighting mismatched or missing
> tokens might be useful, such as for pairs of < and > in complex
> template declarations.


Indeed; braces and parens seemed most useful; my plan is to leave
< > and [ ] for followups.

> But I mainly experimented with it to see if I could get it to
> manifest some of the same symptoms I described in bug 81269.
> I'm not sure it does reproduce the exact same thing or if it's
> a feature, so let me use this as an opportunity to ask.  Given
> something like
> 
>    namespace {
> 
>    enum { e
>    <EOF>
> 
> I see this output:
> 
>    a.C:3:8: error: expected ‘}’ at end of input
>     enum { e
>          ~ ^
>    a.C:3:8: error: expected unqualified-id at end of input
>    a.C:3:8: error: expected ‘}’ at end of input
>    a.C:1:11: note: to match this ‘{’
>     namespace {
>               ^

I think this is an issue with how the diagnostics subsystem chooses
colors; it looks like it's time to rethink that.

Here's what's currently going on:

The color-selection is done in class colorizer in diagnostic-show
-locus.c; the exact colors are in diagnostic-colors.c

> with the first open curly/caret in green, 

This is range 1 within its rich_location, and so colorizer uses state
1, and hence uses the color named "range1", which is implemented as
COLOR_FG_GREEN aka "range1=32" in GCC_COLORS.


> the 'e' in red, and
> the last open curly/caret in cyan.

This is range 0 within its rich_location, and so colorizer uses the
state 0:
      /* Make range 0 be the same color as the "kind" text
         (error vs warning vs note).  */
This diagnostic is an error, and hence it uses the color named "error",
which is bold red.


> Is the green color intended?  And if yes, what is the intent of
> distinguishing it from the red 'e'?  I note that the caret is
> red (and there are no other colors in the output) in this case:
> 
>    namespace { enum {
>    <EOF>
> 
> but becomes green again when I add an enumerator:
> 
>    namespace { enum { e
>    <EOF>

Is there an extra line here that you trimmed?  Presumably in this case
the '{' is being underlined with a "~", and the "e" has the "^" (the
caret)?

In this case, the red is used for that of the caret, and the green for
the secondary location.

I picked this system for coloring the source and annotations in gcc 6
(IIRC), but it seems garish to me now; I think I now favor emulating
what clang does, which is to *not* color the printed source lines, and
to use just one color (green) for underlines and carets.

Can I use PR 81269 for tracking a refresh of how we do colorization in
diagnostics?


> I ask because in the test case in 81269, highlighting the different
> tokens in the three colors seems especially confusing and I'd like
> to better understand if it's intentional (and what it means).
> 
> Incidentally, I tried to make use of this feature in the middle
> end (in gimple-ssa-sprintf.c), to achieve the same effect as
> -Wrestric does, but it led to even stranger-looking results so
> I went back to using plain old warning.   See the attachment to
> bug 81269:
> https://gcc.gnu.org/bugzilla/attachment.cgi?id=41660

Looks like you're seeing a different bug there: you're seeing:

  sprintf (d, "%s%s%s", d, d + 1, d + 2);
  ~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~

with weird-looking colorization of some arguments.

I think what's going on is we have:

primary location:
  sprintf (d, "%s%s%s", d, d + 1, d + 2);
           ^

secondary location 1:
  sprintf (d, "%s%s%s", d, d + 1, d + 2);
  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
to cover:               ^
but this PARAM_DECL usage doesn't have a location, and so
it uses the whole of the call

secondary location 1:
  sprintf (d, "%s%s%s", d, d + 1, d + 2);
                           ~~^~~

The bug here I think is that diagnostic_show_locus is printing all of these 
annotation on top of each other, rather than trying to add new lines:

  sprintf (d, "%s%s%s", d, d + 1, d + 2);
           ^                                 (for loc 0)
  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~      (for loc 1)
                           ~~^~~             (for loc 2)

or:

  sprintf (d, "%s%s%s", d, d + 1, d + 2);
           ^               ~~^~~             (for locs 0 and 2)
  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~      (for loc 1)

Although the ideal here would be for the usages of "d" to have their own 
locations, giving:

  sprintf (d, "%s%s%s", d, d + 1, d + 2);
           ^            ^  ~~^~~

(Not sure if it's worth implementing the extra smarts within 
diagnostic_show_locus vs trying to fix things so that uses of PARM_DELCs have 
their own source locations).


> Martin

[...snip...]

Thanks for trying the patch
Dave

Re: [PATCH 0/3] C/C++: show pertinent open token when missing a close token

Reply via email to