Re: C preprocessor

Giacinto Cifelli Mon, 17 Aug 2020 10:20:24 -0700

Hi Christian,

On Fri, Aug 14, 2020 at 11:11 AM Christian Schoenebeck <
schoeneb...@crudebyte.com> wrote:


> On Donnerstag, 13. August 2020 07:49:52 CEST Giacinto Cifelli wrote:
> > Hi all,
> >
> > I am wondering whether it is possible to interpret a C preprocessor (the
> > second preprocessor, not the one expanding trigraphs and removing "\\\n")
> > or an m4 grammar through bison, and if so, whether it has already been
> > done.
> > I think this kind of tool does not produce a type-2 Chomsky grammar,
> > rather a type-1 or even type-0.
>
> The common classification of languages like C is, I think, "attributed
> context-free language", and it is in Chomsky type 2.
>
> If you just need to handle the preprocessor part, then all you need is a
> lexer with stack enabled. A parser (e.g. Bison) only becomes relevant if you
> also need to process the aspects that come after the preprocessor.
>
> > Any idea how to build something like an AST from it?
> >
> > The purpose would be to use it in a text editor, to know how to format,
> > for example, a block between #if/#endif (according to the condition, it
> > could for example be greyed out if false),
>
> Just to give you a basic idea how this can be done e.g. with Flex, *very*
> roughly (i.e. you have to complete it yourself):
>
>
> /* enable functions yy_push_state(), yy_pop_state(), yy_top_state() */
> %option stack
> /* yyscanner and yyextra, used below, require a reentrant scanner */
> %option reentrant
>
> /* inclusive scanner conditions */
> %s PREPROC_BODY_USE
> /* exclusive scanner conditions */
> %x PREPROC_DEFINE PREPROC_DEFINE_BODY PREPROC_IF PREPROC_BODY_EAT
>
> DIGIT    [0-9]
> ID       [a-zA-Z_][a-zA-Z0-9_]*
>
> %%
>
>  /* #define <name> <body> */
>
> <*>"#define"[ \t]* {
>         yy_push_state(PREPROC_DEFINE, yyscanner);
>         yyextra->token = PreprocessorToken(yytext);
>         return PREPROC_TOKEN_TYPE;
> }
>
> <PREPROC_DEFINE>{ID} {
>         yy_pop_state(yyscanner);
>         yy_push_state(PREPROC_DEFINE_BODY, yyscanner);
>         yyextra->macro_name = yytext;
>         yyextra->token = PreprocessorToken(yytext);
>         return PREPROC_TOKEN_TYPE;
> }
>
> <PREPROC_DEFINE_BODY>[^\n]* {
>         yy_pop_state(yyscanner);
>         yyextra->token = PreprocessorToken(yytext);
>         yyextra->macro_table[yyextra->macro_name] = yytext;
>         return PREPROC_TOKEN_TYPE;
> }
>
>
>  /*
>     #if <condition>
>          <body>
>     #endif
>  */
>
> <*>#if[ \t]* {
>         yy_push_state(PREPROC_IF, yyscanner);
>         yyextra->token = PreprocessorToken(yytext);
>         return PREPROC_TOKEN_TYPE;
> }
>
> <PREPROC_IF>{ID} {
>         yy_pop_state(yyscanner);
>         if (evaluate(yyextra->macro_table[yytext]))
>                 yy_push_state(PREPROC_BODY_USE, yyscanner);
>         else
>                 yy_push_state(PREPROC_BODY_EAT, yyscanner);
>         yyextra->token = PreprocessorToken(yytext);
>         return PREPROC_TOKEN_TYPE;
> }
>
>  /* must come before the eat-up rule: on equal match length the first rule wins */
> <*>[ \t]*"#endif".* {
>     yy_pop_state(yyscanner);
>     yyextra->token = PreprocessorToken(yytext);
>     return PREPROC_TOKEN_TYPE;
> }
>
> <PREPROC_BODY_EAT>.*|\n /* eat up code block filtered out by preprocessor */
>
>  /* Language keywords */
>
> if|else|const|switch|case|int|unsigned {
>         yyextra->token = KeywordToken(yytext);
>         return KEYWORD_TOKEN_TYPE;
> }
>
>  /* String literal */
>
> \"[^"]*\" {
>     yyextra->token = StringLiteralToken(yytext);
>     return STRING_LITERAL_TYPE;
> }
>
>  /* Number literal */
>
> {DIGIT}+("."{DIGIT}+)? {
>     yyextra->token = NumberLiteralToken(yytext);
>     return NUMBER_LITERAL_TYPE;
> }
>
>  /* Other tokens */
>
> <*>. {
>     yyextra->token = OtherToken(yytext);
>     return OTHER_TOKEN_TYPE;
> }
>
> %%
>
>
Thank you for taking the time to answer. Unfortunately, this isn't exactly
what I was looking for.
I am more interested in building a structure from the macro syntax than in
simply expanding the macros.
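
To illustrate the kind of structure I mean, here is a rough sketch in C that
only records the nesting of #if/#ifdef/#ifndef ... #endif blocks as a tree of
line ranges, without evaluating or expanding anything. The names Region,
region_new, directive and build_tree are made up for this example and do not
come from any existing tool. It ignores #else/#elif, comments and line
continuations, so it is an outline under those assumptions, not a working
preprocessor front end.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* One conditional block; nested blocks hang off `child`. (Hypothetical type.) */
typedef struct Region {
    int first_line;          /* line of the opening #if/#ifdef/#ifndef */
    int last_line;           /* line of the matching #endif, 0 if unterminated */
    char condition[128];     /* raw condition text, not evaluated */
    struct Region *parent;
    struct Region *child;    /* first nested block */
    struct Region *next;     /* next sibling block */
} Region;

static Region *region_new(Region *parent, int line, const char *cond)
{
    Region *r = calloc(1, sizeof *r);
    if (!r) abort();
    r->first_line = line;
    r->parent = parent;
    snprintf(r->condition, sizeof r->condition, "%s", cond);
    /* strip trailing whitespace from the condition */
    size_t n = strlen(r->condition);
    while (n > 0 && isspace((unsigned char)r->condition[n - 1]))
        r->condition[--n] = '\0';
    if (parent) {            /* append to the parent's list of children */
        Region **p = &parent->child;
        while (*p) p = &(*p)->next;
        *p = r;
    }
    return r;
}

/* Return the text after '#' and blanks if `line` is a directive, else NULL. */
static const char *directive(const char *line)
{
    while (*line == ' ' || *line == '\t') line++;
    if (*line != '#') return NULL;
    line++;
    while (*line == ' ' || *line == '\t') line++;
    return line;
}

/* Build the tree for `text`; the returned root stands for the whole file. */
Region *build_tree(const char *text)
{
    assert(text != NULL);
    Region *root = region_new(NULL, 0, "<file>");
    Region *cur = root;
    int lineno = 0;
    while (*text) {
        const char *eol = strchr(text, '\n');
        size_t len = eol ? (size_t)(eol - text) : strlen(text);
        char line[256];      /* longer lines are truncated in this sketch */
        snprintf(line, sizeof line, "%.*s", (int)len, text);
        lineno++;
        const char *d = directive(line);
        if (d && strncmp(d, "if", 2) == 0) {   /* #if, #ifdef, #ifndef */
            const char *cond = d;
            while (isalpha((unsigned char)*cond)) cond++;  /* skip the name */
            while (*cond == ' ' || *cond == '\t') cond++;
            cur = region_new(cur, lineno, cond);
        } else if (d && strncmp(d, "endif", 5) == 0 && cur != root) {
            cur->last_line = lineno;           /* close the innermost block */
            cur = cur->parent;
        }
        text += len + (eol ? 1 : 0);
    }
    return root;
}
```

An editor could walk such a tree to decide which line ranges to grey out once
it has some way to evaluate each stored condition; that evaluation step is
exactly the part left open here.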

Regards,
Giacinto
