Re: [PATCH 3/5] Implement token range tracking within libcpp and the C FE (v2)

Dodji Seketeli Fri, 25 Sep 2015 09:05:12 -0700

David Malcolm <[email protected]> a écrit:

> On Fri, 2015-09-25 at 11:13 +0200, Dodji Seketeli wrote:
[...]


>> Funny; I first overlooked this comment of you, and then when reading the
>> patch, I asked myself "why keep the existing location_t" ?  I mean, in
>> here:
>> 
>>      /* A preprocessing token.  This has been carefully packed and should
>>     -   occupy 16 bytes on 32-bit hosts and 24 bytes on 64-bit hosts.  */
>>     +   occupy 16 bytes on 32-bit hosts and 24 bytes on 64-bit hosts.
>>     +   FIXME: the above comment is no longer true with this patch.  */
>>      struct GTY(()) cpp_token {
>>        source_location src_loc;      /* Location of first char of token.  */
>>     +  source_range src_range;       /* Source range covered by the token.  
>> */
>>        ENUM_BITFIELD(cpp_ttype) type : CHAR_BIT;  /* token type */
>>        unsigned short flags;         /* flags - see above */
>>  
>> You could just change the type of src_loc and make it be a source_range.
>
> Interesting idea.
>
> For the general case of expressions, I want a location to mean a
> caret/point plus a range that contains it, but for tokens, the
> caret/point is always at the start of the range.  So maybe a src_range
> would do here.
>
> That said, in patches 3 and 4 of this kit I'm experimenting with
> representation; as I said in the blurb for patch 3: "See the
> cover-letter for this patch kit which describes how we might go back to
> using just a location_t, and stashing the range inside the location_t.
> I'm doing it this way for now to allow for more flexibility as I
> benchmark and explore implementation options."

Right.

> So patches 3 and 4 are rather more experimental than patches 1,2 and 5,
> as I find out what the different representations do to the time&memory
> consumption of the compiler.

Understood.

> I like the idea of "source_location" and "location_t" becoming a
> representation of "(point/caret + range)" rather than just a
> point/caret, since this means that we can pass location_t around as
> before, but then we can extract ranges from them, and all of the
> existing diagnostics ought to "automagically" gain underlines "for
> free".

Me too.

>  I'm experimenting with ways to try to do that efficiently, with
> strategies for packing things into the 32-bits compactly; see e.g.:
>  https://gcc.gnu.org/ml/gcc-patches/2015-09/msg01826.html
>
> If so, then cpp_token's src_loc would remain a source_location;
> source_location itself becomes richer.
>
>> Source range could take a converting constructor, that takes a
>> source_location, so that the existing code that does "cpp_token.src_loc
>> = a_source_location;" keeps working.
>> 
>> But then, in the previous patch of the series, I see:
>> 
>> +/* A range of source locations.
>> +
>> +   Ranges are closed:
>> +   m_start is the first location within the range,
>> +   m_finish is the last location within the range.
>> +
>> +   We may need a more compact way to store these, but for now,
>> +   let's do it the simple way, as a pair.  */
>> +struct GTY(()) source_range
>> +{
>> +  source_location m_start;
>> +  source_location m_finish;
>> +
>> +  void debug (const char *msg) const;
>> +
>> +  /* We avoid using constructors, since various structs that
>> +     don't yet have constructors will embed instances of
>> +     source_range.  */
>> +
>> 
>> But what if we define a default constructor for that one (as well as one
>> that takes a source_location)?  Wouldn't that work for the embedding
>> case that you talk about in that comment?
>
> Perhaps, but I worry that it would lead to a cascade, where suddenly
> we'd need constructors for various other types.  I can try it, I
> guess.

If it leads to a cascade, then don't bother.  My point was precisely to
try to avoid the churn, while limiting the amount of data size increase
for cpp_token in general.

As you implied, if we can just stay with a source_location that carries
the information of a pointer plus a range, even better.

> [BTW, I'm about to disappear on a vacation from tomorrow until October
> 6th, and away from computers]

Thanks for the heads-up.

-- 
                Dodji

Re: [PATCH 3/5] Implement token range tracking within libcpp and the C FE (v2)

Reply via email to