On Fri, Oct 18, 2024 at 11:25 AM David Malcolm <[email protected]> wrote:
> > if (!pfile->cb.diagnostic)
> > abort ();
> > - ret = pfile->cb.diagnostic (pfile, level, reason, richloc,
> > _(msgid), ap);
> > -
> > - return ret;
> > + if (pfile->diagnostic_override_loc && level != CPP_DL_NOTE)
> > + {
> > + rich_location rc2{pfile->line_table, pfile-
> > >diagnostic_override_loc};
> > + return pfile->cb.diagnostic (pfile, level, reason, &rc2,
> > _(msgid), ap);
> > + }
>
> This will effectively override the primary location in the
> rich_location, but by using a second rich_location instance it will
> also ignore any secondary locations and fix-it hints.
>
> This might will be what we want here, but did you consider
> richloc.set_range (0, pfile->diagnostic_override_loc,
> SHOW_RANGE_WITH_CARET);
> to reset the primary location?
>
> Otherwise, looks good to me.
>
> [...snip...]
>
> Thanks
> Dave
>
Thanks for taking a look! My thinking was, when libcpp produces tokens
from a _Pragma string, basically every location_t that it generates is
wrong and shouldn't be used. Because it doesn't actually set up the
line_map, it gets something random that's just sorta close to
reasonable. So I think it makes sense to discard fixits and secondary
locations too.
libcpp does use rich_location pretty sparingly, but I suppose the goal
is to use it more over time. We use one fixit hint for invalid
directive names currently, that one can't show up in a _Pragma though.
Right now we do define an encoding_rich_location subclass for escaping
unprintable bytes, which inherits rich_location and just adds a new
constructor to set the escape flag when it is created. You are
definitely right that this patch as of now loses that information.
Here's a source that uses an improperly normalized character:
_Pragma("ோ")
Currently we output:
t.cpp:1:1: warning: ‘\U00000bc7\U00000bbe’ is not in NFC [-Wnormalized=]
1 | _Pragma("<U+0BC7><U+0BBE>")
| ^~~~~~
With the patch we output:
t.cpp:1:1: warning: ‘\U00000bc7\U00000bbe’ is not in NFC [-Wnormalized=]
1 | _Pragma("ோ")
| ^~~~~~~
So the main location range is improved (it underlines all of _Pragma
instead of most of it), but we have indeed lost the intended feature
that the incorrect bytes are escaped on the source line.
For this particular case I could improve it with a one line addition like
rc2.set_escape_on_output (richloc->escape_on_output_p ());
and that would actually handle all currently needed cases since we
don't use a lot of rich_locations in libcpp. It would just become
stale if some other option gets added to rich_location in the future
that we also should preserve. I think to fix it in a fully general
way, it would be necessary to add a new interface to class
rich_location. It already has a method to delete all the fixit hints,
it would also need a method to delete all the ranges. Then we could
make a copy of the richloc and just delete everything we don't want to
preserve. Do you have a preference one way or the other?
By the way, your suggestion was to directly modify richloc... these
functions do take the richloc by non-const pointer, but is it OK to
modify it or is a copy needed? The functions like cpp_warning_at() are
exposed in the public header file, although at the moment, all call
sites are within libcpp and don't look like they would notice if the
argument was modified. I wasn't sure what is the expected interface
here.
Thanks again...
-Lewis