On Tue, Sep 15, 2015 at 2:08 PM, Manuel López-Ibáñez <lopeziba...@gmail.com> wrote:
> On 15/09/15 12:20, Jakub Jelinek wrote:
>>
>> On Tue, Sep 15, 2015 at 12:14:22PM +0200, Richard Biener wrote:
>>>>
>>>> diff --git a/gcc/cp/parser.h b/gcc/cp/parser.h
>>>> index 760467c..c7558a0 100644
>>>> --- a/gcc/cp/parser.h
>>>> +++ b/gcc/cp/parser.h
>>>> @@ -61,6 +61,8 @@ struct GTY (()) cp_token {
>>>>    BOOL_BITFIELD purged_p : 1;
>>>>    /* The location at which this token was found. */
>>>>    location_t location;
>>>> +  /* The source range at which this token was found. */
>>>> +  source_range range;
>>>
>>>
>>> Is it just me or does location now feel somewhat redundant with range?
>>> Can't we compress that somehow?
>>
>>
>> For a token I'd expect it is redundant, I don't see how it would be useful
>> for a single preprocessing token to have more than start and end
>> locations.
>
>
> If memory usage is a concern, can't we easily find out the end location of a
> token just by simply re-lexing it from the start location? Many tokens are a
> single character.
>
>> But generally, for expressions, 3 locations make sense.
>> If you have
>> abc + def
>> ~~~~^~~~~
>> then having a range is useful.
>
>
> It seems you want to have a location for '+' plus left-most and right-most
> locations. However, we will need the location of 'a' and the location of
> 'd', not only the location of 'f'. Thus, we probably want to have (or build)
> a range for each operand, to be able to handle something like:
>
> (a + b) + (c + d)
> ~~~~~~~ ^ ~~~~~~~
>
> This does not require to track the ranges of every token, but it requires to
> track ranges of expressions when building them. Moreover, we want to store
> these ranges/locations in the expression node, since many operands
> (VAR_DECL, constants, etc) do not have a location. (In my humble opinion,
> this a more serious defect of GCC than not tracking a range for tokens
> https://gcc.gnu.org/bugzilla/PR43486)

Of course this boils down to "uses" of a VAR_DECL all referring to the
same shared tree node. On GIMPLE some statement kinds have separate
locations for each operand (PHI nodes); on GENERIC we'd have to invent a
no-op expression tree code to wrap such uses so that they can be given
distinct locations (we can't reuse something existing, as frontends would
need to ignore the wrappers in a different way than, say, NOP_EXPRs or
NON_LVALUE_EXPRs).

> Note also that we do not necessarily need to track ranges in libcpp to print
> ranges in diagnostics. The latter can be implemented and useful before the
> former. The example above:
>
> void foo(void)
> {
>   float c,d;
>   int * a,b;
>   (a + b) + (c + d); //error: invalid operands to binary + (have ‘int *’ and ‘float’)
> }
>
> could be implemented simply by building the ranges while parsing (as I did
> in https://gcc.gnu.org/ml/gcc-patches/2009-08/msg00174.html), no need to
> store them explicitly. My intuition is that many of the ranges needed by
> diagnostics could be dynamically generated from two locations and passed to
> the point where it is used (like we do with location_t). We could store
> them, but we do not need to. Some examples:
>
>   int y = *SomeA.X;
>           ^~~~~~~~
>   myvec[1]/P;
>   ~~~~~~~~^~
>   struct point origin = { x: 0.0, y: 0.0 };
>                           ~~ ^
>                           .x =
>
> Do we have a place to store the range for "myvec[1]" or for "x:" ? (honest
> question).
>
> Cheers,
>
> Manuel.
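
To make the "build the range while parsing" idea concrete, here is a minimal
sketch in C. It is not GCC code: src_loc, expr_range, make_operand_range,
make_binary_range and render_underline are hypothetical names invented for
illustration, and a location is modeled as a plain 0-based column on a single
line rather than a real location_t. It only shows how a caret plus left-most
and right-most locations could be derived from the operand ranges and the
operator location, without storing a range on every token.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for location_t: a 0-based column on one line.  */
typedef unsigned int src_loc;

/* Hypothetical three-location range: caret plus left-most and
   right-most columns (both inclusive).  */
struct expr_range
{
  src_loc start;
  src_loc caret;
  src_loc finish;
};

/* Range of a simple operand occupying columns START..FINISH; the caret
   is placed at its first column.  */
static struct expr_range
make_operand_range (src_loc start, src_loc finish)
{
  struct expr_range r = { start, start, finish };
  assert (start <= finish);
  return r;
}

/* Range of "LHS op RHS", built at the point where the parser creates
   the binary expression: it spans both operands and puts the caret on
   the operator.  */
static struct expr_range
make_binary_range (struct expr_range lhs, src_loc op_loc,
                   struct expr_range rhs)
{
  struct expr_range r = { lhs.start, op_loc, rhs.finish };
  assert (lhs.finish < op_loc && op_loc < rhs.start);
  return r;
}

/* Write an underline for R into BUF (capacity CAP bytes): spaces up to
   the start column, '~' across the range, '^' at the caret.  */
static void
render_underline (struct expr_range r, char *buf, size_t cap)
{
  size_t len = (size_t) r.finish + 1;
  if (cap == 0)
    return;
  if (len > cap - 1)
    len = cap - 1;
  memset (buf, ' ', len);
  for (size_t i = r.start; i < len; i++)
    buf[i] = '~';
  if (r.caret < len)
    buf[r.caret] = '^';
  buf[len] = '\0';
}
```

For "abc + def", combining make_operand_range (0, 2) and
make_operand_range (6, 8) with the operator at column 4 gives the range
{0, 4, 8}, which render_underline turns into the "~~~~^~~~~" underline
quoted above. Note that the sketch sidesteps the VAR_DECL problem entirely:
the operand ranges are passed by value from the parser, not looked up on a
shared tree node.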