On Fri, 2015-09-25 at 16:06 +0200, Dodji Seketeli wrote: > Hello, > > Similarly to a comment I made on the previous patch of the series, > > +++ b/libcpp/include/line-map.h > > [...] > > struct GTY(()) location_adhoc_data { > source_location locus; > + source_range src_range; > void * GTY((skip)) data; > }; > > Could we just change the type of locus and make it be a source_range > instead? With the right converting constructor (in the source_range > class) that takes a source_location, the amount of churn should > hopefully be minimized, or maybe I am missing something ...
Thanks. I think that the above is one place where we *would* want both locus and src_range. One key idea within this patch is for source_location/location_t to be able to track both a point/caret/locus and a range containing it. The idea is to stash the range information within the ad-hoc table. For the most simple expressions involving just one token, locus == src_range.m_start, but in the most general case, locus, src_range.m_start and src_range.m_finish are all different from each other; consider this expression: foo && bar ~~~~^~~~~~ (i.e. where the caret/locus is at the first '&' and the range starts at the 'f' of "foo" and finishes at the 'r' of "bar"). As noted elsewhere, we might try to pack short ranges directly into the 32 bits of source_location: https://gcc.gnu.org/ml/gcc-patches/2015-09/msg01826.html which would avoid having to use ad-hoc for such short ranges; ideally most tokens. I'm experimenting with that (I don't have it fully working yet). > [...] > > diff --git a/libcpp/line-map.c b/libcpp/line-map.c > > [...] > > +source_range > +get_range_from_adhoc_loc (struct line_maps *set, source_location loc) > +{ > > Please add a comment for this function. (nods) > [...] > > diff --git a/gcc/tree.c b/gcc/tree.c > > +void > +set_source_range (tree *expr, location_t start, location_t finish) > > Please add a comment for this function and its overloads. (nods) > [...] > > +void > +set_c_expr_source_range (c_expr *expr, > + location_t start, location_t finish) > +{ > > Likewise. (nods) > +location_t > +set_block (location_t loc, tree block) > +{ > > Likewise. I am wondering if we shouldn't even change the name of this > function to better suit what it does: associate a tree to a location. "associate_tree_with_location" ? > + source_range src_range; > + if (IS_ADHOC_LOC (loc)) > + /* FIXME: can we update in-place? */ > + src_range = get_range_from_adhoc_loc (line_table, loc); > > Hmmh, at the moment, I don't think we can update an entry of the adhoc > map that associates {location, range, tree} as all three components > contribute to the hash value of the entry. A new triplet means a new > entry. > > My understanding is that initially the two elements of the pair > {location, tree} were contributing to the hash value because the same > location could very well belong to different blocks. Ah; thanks. > + else > + src_range = source_range::from_location (loc); > + > + return COMBINE_LOCATION_DATA (line_table, loc, src_range, block); > +} > > [...] > > @@ -6091,6 +6112,10 @@ c_parser_conditional_expression (c_parser *parser, > struct c_expr *after, > > if (c_parser_next_token_is_not (parser, CPP_QUERY)) > return cond; > + if (cond.value != error_mark_node) > + start = cond.src_range.m_start; > + else > + start = UNKNOWN_LOCATION; > > I think that "getting the start range of a c_expr" is an operation > that is generally useful; and it's also generally useful for that > operation to handle cases where the tree carried by the c_expr can be > an error_mark_node. So maybe it would be appropriate to have a getter > function for that operation and use it here instead. Good idea; thanks. That will be helpful as I try other representations. > You would then use that operation in the other places of this patch > that are getting c_expr::src_range.start -- by the way, those other > places are not handling the error_mark_node case like the above. > > [...] > > +++ b/gcc/c/c-tree.h > @@ -132,6 +132,9 @@ struct c_expr > The type of an enum constant is a plain integer type, but this > field will be the enum type. */ > tree original_type; > + > + /* FIXME. */ > + source_range src_range; > }; > > Why a FIXME here? To remind myself before posting the patches that I need to give the field a descriptive comment, explaining what purpose it serves. Oops :) It probably should read something like this: /* The source range of this C expression. This might be thought of as redundant, since ranges for expressions can be stored in a location_t, but not all tree nodes in expressions have a location_t. Consider this source code: int test (int foo) { return foo * 100; ^^^ ^^^ } During C parsing, the ranges for "foo" and "100" are stored within this field of c_expr, but after parsing to GENERIC, all we have is a VAR_DECL and an INTEGER_CST (the former's location is in at the top of the function, and the latter has no location). For consistency, we store ranges for all expressions in this field, not just those that don't have a location_t. */ source_range src_range; > +#define CAN_HAVE_RANGE_P(NODE) (CAN_HAVE_LOCATION_P (NODE)) > > Please add a comment for this. > +#define EXPR_LOCATION_RANGE(NODE) (get_expr_source_range (EXPR_CHECK > ((NODE)))) > > Likewise. > +#define EXPR_HAS_RANGE(NODE) \ > + (CAN_HAVE_RANGE_P (NODE) \ > + ? EXPR_LOCATION_RANGE (NODE).m_start != UNKNOWN_LOCATION \ > + : false) > + > > Likewise. > [...] > > +static inline source_range > +get_expr_source_range (tree expr) > > Likewise. > > +#define DECL_LOCATION_RANGE(NODE) \ > + (get_decl_source_range (DECL_MINIMAL_CHECK (NODE))) > + > > Likewise. > > +static inline source_range > +get_decl_source_range (tree decl) > +{ > > Likewise. Thanks Dave [BTW, I'm about to disappear on a vacation from tomorrow until October 6th, and will be away from email and computers]