Re: [RFC WIP] RAW_DATA_CST for #embed optimization

Richard Biener Sun, 07 Jul 2024 09:56:16 -0700

> On 07.07.2024 at 17:14, Jakub Jelinek <ja...@redhat.com> wrote:
> 
> On Sun, Jul 07, 2024 at 09:02:57AM +0200, Richard Biener wrote:
>> I see.  I was wondering because PCH includes are not resolved.  That said,
>> it sounds like #embed is sadly defined on the preprocessor side rather
>> than in the language where it would have been easy to constrain uses to
>> those that make sense…
> 
> I think there were big discussions on this and at some stage it has been
> a builtin etc.
> 
>> Yeah, I wondered whether, wherever the raw data survives, we could always
>> wrap it in a CONSTRUCTOR and add a RANGE_TARGET_BYTES element.  This may
>> be useful to encode large initializers more efficiently during/after
>> parsing.
> 
> We definitely should eventually try to improve the handling of large
> initializers which do not use #embed too; where to approach it depends on
> where the large overheads are.
> It could be handled in the preprocessor, say after we see 128 or however
> many CPP_NUMBERs from 0-255 alternating with CPP_COMMA, do some lookahead
> and construct a CPP_EMBED, or it could be done during parsing of the
> initializer: similarly, after seeing a certain number of initializers of a
> CHAR_BIT array, use the C FE raw token lexing to look ahead and create a
> RAW_DATA_CST out of that if beneficial, etc.
> It really depends on where the biggest overhead is, whether it is in
> creation of the millions of CPP_NUMBER/CPP_COMMA tokens, or primarily
> when creating the large CONSTRUCTOR (the INTEGER_CSTs for the values should
> be shared, at most 256 of them, but the indexes are not).
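
To make the token-run idea quoted above concrete, here is a rough standalone
sketch.  It is not libcpp code: enum tok_kind, struct tok, MIN_RUN and
collect_byte_run are hypothetical names made up for illustration, standing in
for CPP_NUMBER/CPP_COMMA tokens and for whatever threshold turns out to be
sensible.  It only shows the detection step, not how a CPP_EMBED or
RAW_DATA_CST would then be built.

```c
#include <stddef.h>

/* Hypothetical stand-ins for CPP_NUMBER / CPP_COMMA; not libcpp's types.  */
enum tok_kind { TOK_NUMBER, TOK_COMMA, TOK_OTHER };

struct tok
{
  enum tok_kind kind;
  long value;   /* Only meaningful for TOK_NUMBER.  */
};

/* Threshold taken from "128 or however many"; an arbitrary assumption.  */
#define MIN_RUN 128

/* Starting at TOKS[START], collect numbers in 0..255 that alternate with
   commas.  OUT must have room for N - START bytes and may be written to
   even when the run turns out to be too short.  Return the number of bytes
   collected if it is at least MIN_RUN, otherwise 0 so the caller keeps
   using the normal token-by-token path.  */
size_t
collect_byte_run (const struct tok *toks, size_t n, size_t start,
                  unsigned char *out)
{
  size_t i = start, count = 0;

  while (i < n
         && toks[i].kind == TOK_NUMBER
         && toks[i].value >= 0 && toks[i].value <= 255)
    {
      out[count++] = (unsigned char) toks[i].value;
      i++;
      /* The run continues only across a comma.  */
      if (i < n && toks[i].kind == TOK_COMMA)
        i++;
      else
        break;
    }
  return count >= MIN_RUN ? count : 0;
}
```

A caller that gets a nonzero count back would have consumed that many numbers
plus the commas between them, and could hand the bytes in OUT to whatever
compact representation is chosen.  Whether this is worth doing at the
preprocessor level or in the parser is exactly the open question above.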

There’s a very old PR about the regression for very large static initializers
compared to when we wrote those directly to asm_out.

Richard 

>    Jakub
>