Re: Better GCC diagnostics

Ian Lance Taylor Fri, 15 Aug 2008 11:54:05 -0700

"Manuel López-Ibáñez" <[EMAIL PROTECTED]> writes:

> 2008/8/15 Ian Lance Taylor <[EMAIL PROTECTED]>:
>> "Manuel López-Ibáñez" <[EMAIL PROTECTED]> writes:
>>
>>> A) Printing the input expression instead of re-constructing it. As
>>>    Joseph explained, this will fix the problems that Aldy mentioned
>>>    (PR3544[123] and PR35742) and this requires:
>>>
>>>   1) For non-preprocessed expr we need at least two locations per expr
>>>      (beg/end). This will require changes to the build_* functions to
>>>      handle multiple locations.
>>
>> This is probably obvious, but can you outline why we need two
>> locations for each expression?  The tools with which I am familiar
>> only print a single caret.  What would we use the two locations for?
>
> This has nothing to do with caret diagnostics. This is an orthogonal
> issue that would share some infrastructure as Joseph explained. If you
> do
>
> warning("called object %qE is not a function", expr);
>
> for
>
> ({break;})();
>
> we currently try to re-construct expr and that fails in some cases
> (see the PRs referenced).
>
> #'goto_expr' not supported by pp_c_expression#'bug.c: In function 'foo':
> bug.c:4: error: called object  is not a function
>
> The alternative is to print whatever we parsed when building expr. To
> do that we would need to have begin/end locations for expr, and then
> do a location_t->const char * translation and print whatever is
> between those two pointers:
>
> bug.c:4: error: called object '({break;})' is not a function
> 
>
> Is it clear now? If so, I will update the wiki to include this example.


That is clear.  Thanks.  I personally would be perfectly happy if the
compiler said
    bug.c:4.COLUMN: error: called object is not a function
That is, fixing the compiler to include parts of the source code in
the error message itself is, for me, of considerably lower priority
than fixing the compiler to generate good column numbers.
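
The two ideas above (a column number in the location prefix, and
quoting the user's own text between a begin and an end location) can
be sketched together in a few lines of C.  This is only an
illustration under stated assumptions: `struct expr_range` and
`format_not_a_function` are hypothetical names, and the range is given
as plain byte offsets into a buffer holding the source text, whereas
GCC's location_t is an index into its line maps.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a begin/end location pair.  GCC's real
   location_t is an index into its line maps, not a byte offset.  */
struct expr_range
{
  int line;      /* 1-based line on which the expression starts */
  int column;    /* 1-based column on which the expression starts */
  size_t begin;  /* byte offset of the first character */
  size_t end;    /* byte offset one past the last character */
};

/* Format a diagnostic that quotes the text the user wrote, copied
   straight out of SOURCE, instead of re-constructing it from a tree.
   Returns whatever snprintf returns.  */
int
format_not_a_function (char *out, size_t out_size, const char *file,
                       const char *source, struct expr_range r)
{
  assert (r.begin <= r.end);
  return snprintf (out, out_size,
                   "%s:%d:%d: error: called object '%.*s' is not a function",
                   file, r.line, r.column,
                   (int) (r.end - r.begin), source + r.begin);
}
```

Dropping the quoted text from the format string leaves the
column-only message, so the column number is the part both variants
depend on.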



>>>      b) Re-open the file and fseek. This is not trivial since we need
>>>         to do it fast but still do all character conversions that we
>>>         did when libcpp opened it the first time. This is
>>>         approximately what Clang (LLVM) does and it seems they can do
>>>         it very fast by keeping a cache of the buffers it has reopened. I
>>>         think that thanks to our line-maps implementation, we can do
>>>         the seeking considerably more efficiently in terms of computation
>>>         time.  However, opening files is quite embedded into CPP, so
>>>         that would need to be factored out so we can avoid any
>>>         unnecessary CPP stuff when reopening but still do it
>>>         *properly* and *efficiently*.
>>
>> If we are going to reopen the file, then why do we need to record the
>> locations in the preprocessed token stream?
>
> Because for some diagnostics we want to give the warnings at the
> instantiation point, not at the macro definition point. Moreover, this
> is what we currently do, so if we don't want to change the current
> behaviour, we need to track both locations.
>
> Example
>
> /*header.h*/
> #pragma GCC system_header
> #define BIG  0x1b27da572ef3cd86ULL
>
> /* file.c */
> #include "header.h"
> __extension__ unsigned long long
> bar ()
> {
>   return BIG;
> }
>
> We print a diagnostic at file.c for the expansion of BIG. However,
> since we do not have the original location we cannot check that the
> token comes from a system header, and we do not suppress the warning.
> There are more subtle bugs that arise from not having the original
> location available. See PR36478.
>
> BTW, Clang takes into account both locations when printing diagnostics.

Perhaps I misunderstand what you mean by recording the location in the
preprocessed token stream.  You evidently do not mean getting column
numbers for the preprocessed code.  You mean that when a preprocessor
macro is expanded, we should record both the location where the macro
is used, and also some sort of reference to the macro so that we know
the location where the macro was defined.  Is that right?
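
To make that reading concrete, here is a minimal sketch of a token
location that carries both points.  All names (`struct src_point`,
`struct token_loc`, `should_warn`, `report_point`) are hypothetical
and are not libcpp or line-map APIs.  The sketch assumes a diagnostic
is reported at the point of use but suppressed when the token was
spelled in a system header, which is the behaviour the BIG example
asks for.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical source position; not a libcpp type.  */
struct src_point
{
  const char *file;
  int line;
  bool in_system_header;
};

/* A token location that remembers both places of interest.  */
struct token_loc
{
  struct src_point expansion;          /* where the macro is used */
  const struct src_point *definition;  /* where the macro was defined,
                                          or NULL if the token did not
                                          come from a macro */
};

/* Suppress the warning when the token text originates in a system
   header, even if the macro is expanded in user code.  */
bool
should_warn (const struct token_loc *loc)
{
  if (loc->definition != NULL && loc->definition->in_system_header)
    return false;
  return !loc->expansion.in_system_header;
}

/* Diagnostics are still reported at the point of use.  */
const struct src_point *
report_point (const struct token_loc *loc)
{
  return &loc->expansion;
}
```

With only the expansion point recorded, `should_warn` has nothing to
test against, which is the failure described for the BIG example.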


>> If we keep, for each source line, the file offset in the file of the
>> start of that source line, then I think that printing the line from
>> the source file would be pretty fast.  That would not be free but it
>> would be much cheaper than keeping the entire input file.  Various
>
> Cheaper in terms of memory. It cannot be cheaper in terms of
> compilation time than a direct pointer to the already opened buffer
> for each line-map.

Except that we know that increased memory use leads to increased
compile time.  Diagnostic printing can't be slow, but it's also not on
the critical path.  Most code does not generate diagnostics.  So there
is a balance to be struck.
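
As a rough sketch of the per-line offset idea: record the byte offset
at which each source line starts, then seek to that offset and read a
single line when a diagnostic needs it.  The names `struct line_index`,
`build_line_index` and `read_source_line` are made up for this
example.  It uses plain stdio on a stream opened in binary mode and
does none of the character conversions libcpp performs, so it shows
only the seeking cost, not a complete solution.

```c
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* One file offset per source line: the only per-line state kept.  */
struct line_index
{
  long *starts;   /* starts[i] is the byte offset of line i + 1 */
  size_t count;   /* number of lines recorded */
};

/* Scan FP once from the beginning and record where each line starts.
   Returns 0 on success, -1 if memory runs out.  */
int
build_line_index (FILE *fp, struct line_index *idx)
{
  size_t cap = 16;
  long pos = 0;
  int c;
  int at_line_start = 1;

  idx->count = 0;
  idx->starts = malloc (cap * sizeof *idx->starts);
  if (idx->starts == NULL)
    return -1;

  rewind (fp);
  while ((c = getc (fp)) != EOF)
    {
      if (at_line_start)
        {
          if (idx->count == cap)
            {
              long *grown = realloc (idx->starts, 2 * cap * sizeof *grown);
              if (grown == NULL)
                {
                  free (idx->starts);
                  idx->starts = NULL;
                  idx->count = 0;
                  return -1;
                }
              idx->starts = grown;
              cap *= 2;
            }
          idx->starts[idx->count++] = pos;
        }
      at_line_start = (c == '\n');
      pos++;
    }
  return 0;
}

/* Seek to 1-based LINE and copy it into BUF without the trailing
   newline.  Lines longer than BUF are truncated.  Returns 0 on
   success, -1 on a bad line number or an I/O error.  */
int
read_source_line (FILE *fp, const struct line_index *idx, size_t line,
                  char *buf, size_t buf_size)
{
  if (line == 0 || line > idx->count || buf_size == 0 || buf_size > INT_MAX)
    return -1;
  if (fseek (fp, idx->starts[line - 1], SEEK_SET) != 0)
    return -1;
  if (fgets (buf, (int) buf_size, fp) == NULL)
    return -1;
  buf[strcspn (buf, "\n")] = '\0';
  return 0;
}
```

The memory cost is one `long` per source line, and the seek and read
happen only when a diagnostic is actually printed.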

Ian
