Re: can i rewrite yyparse() based on ACTION and GOTO tables extracted from XML output?

2021-11-06 Thread David M. Warme

Christian, Levin, Akim and all,

Keep in mind that most of the CPU time is usually spent on lexer side. So it 
would make sense to optimize that first. However that's one of the issues 
where you quickly realize that a split parser/lexer design is in the way. If 
they were tied together, you could make much more aggressive optimizations by 
having knowledge of both grammar and RegEx patterns.


For instance right now, whenever you have some input, the lexer must always 
assume that any of the patterns may match. Whereas a tied grammar/lexer would 
have the knowledge that for a certain parser state, only a small set of 
patterns may match, which in turn would allow much more compact and faster 
lexer branches for those states to be generated.


The point that Akim is making is that this would yield a combinatorial
expansion (explosion?) of the state space.  Whether done in tables or
code, you would lose enough locality that cache and TLB misses would
likely become an issue -- at least on realistically large grammars.

Then consider that lexers don't just recognize an identifier, they
usually construct an object to hold it (or look one up in a hash
table), storing the pointer, e.g., into yylval.  How many copies
of this Flex "action code" do you want?  Hint: you had better
make it be a function with carefully-defined calling convention, or
the code bloat will be fierce.  Better yet, compose a separate
function for each lexer action automagically.  Flex naturally gives
you just one copy of this code.

Also remember that the tables used by many lexers perform multi-way
branches on each character, consolidating many potential pipeline
flushes into one.  The choice of whether to code these as
if-then-else trees or switch-statement branch tables is a complex
trade off between code size and pipeline flushes.  And there is
the question of whether branch-prediction logic can maintain a low
misprediction rate in the face of such a large flow graph containing
a huge number of cyclic paths, few of which offer any extended
number of iterations (e.g., when parsing the initializers for
Bison parse tables! :^).  As Akim said, it really isn't clear how
well this approach would pay off on modern architectures.

This could also really hurt the total code size on systems like
the one I work on that have 150+ grammars.

The good news is that GCC offers pretty much all of the tools
you might need to automatically render a Bison state machine
into "inline" C code:

 - Define a "struct" containing all of the data structures
   (state/value/location stack bases, pointer and sizes,
   yylval, yylloc, error recovery stuff, etc.).  This can be
   passed to various functions with a single pointer argument.
   The one instance lives on the stack frame of yyparse().

 - Make a _yylex(ptr) wrapper function taking this one pointer,
   that calls yylex(), and properly adapts its calling convention.
   (For example, use the re-entrant yylex() conventions.)
 
 - Using && to take the address of labels and "goto *pointer;"

   could really help.  Define one label to "shift" into a state,
   a second label to "goto" that state after a reduction.
   Only the second kind of label need appear on the state
   stack.  Both of these need multi-way dispatch: the "shift"
   portion on tokens, the "goto" portion on non-terminals.

 - Code the "shift" and "goto" dispatch of all states as
   switch statements, letting GCC choose between branch tree
   jump table, or combination.  The "default:" for the
   "shift" switch must always be to error recovery.  The
   "default:" for the "goto" switch could be the most
   frequent destination state, whose separate cases are
   then omitted.  In states where the only action is a
   single reduce, the "shift" switch could be replaced with
   a single reduce action.

 - Encapsulate the "action code" for each rule in a separate
   function, since many of these would be called from many
   places.  This would be an excellent place to pop the
   stack pointers, since each rule knows how many symbols
   appear on its own right-hand-side (thereby eliminating
   the yyr2 table).  These functions should also return the
   right-hand-side symbol reduced, becoming input when
   entering the "goto" label of the popped state
   (eliminating yyr1 table).

 - Note that the numbering scheme for symbols returned
   by these action functions are used only to perform the
   proper "goto" function for the popped state.  It is
   entirely possible that a different numbering of the
   various non-terminals might produce more contiguous
   jump tables in each state's "goto" code.  (Interesting
   combinatorial optimization problem / open research
   problem.)

 - Some code (e.g., to reallocate the stacks when about to
   overflow) should be coded as functions that can easily
   be called from many places (every shift and goto).

I would start with just a Bison-level state machine before
trying to combine parser and lexer with such 

Re: can i rewrite yyparse() based on ACTION and GOTO tables extracted from XML output?

2021-11-06 Thread Christian Schoenebeck
On Samstag, 6. November 2021 09:44:55 CET Akim Demaille wrote:
> Hi Levin,
> 
> > Le 28 oct. 2021 à 08:24, elite liu  a écrit :
> > 
> > Dear Akim,
> > 
> >I am writing a utility that convert a table-driven dfa into
> > 
> > condition/if-else form, for example:
> > [...]
> 
> I have always wondered if code won't be more efficient than tables.
> Since compiler and cpu have learned a lot of tricks to make code
> faster, I wouldn't be surprised to have some speedup.
> 
> > If above codes are writen in tabled-driven implementation, it should be
> > more concise.
> 
> Sure.  And compactness is also cache friendliness.  So I agree
> it is very much unclear which approach will beat the other one.
> On today's machines.
> 
> Cheers.

The intended approach would still be a character by character procedure which 
is much slower than allowing a lexer to perform string comparisons as much as 
possible:

https://lemire.me/blog/2020/04/30/for-case-insensitive-string-comparisons-avoid-char-by-char-functions/

Keep in mind that most of the CPU time is usually spent on lexer side. So it 
would make sense to optimize that first. However that's one of the issues 
where you quickly realize that a split parser/lexer design is in the way. If 
they were tied together, you could make much more aggressive optimizations by 
having knowledge of both grammar and RegEx patterns.

For instance right now, whenever you have some input, the lexer must always 
assume that any of the patterns may match. Whereas a tied grammar/lexer would 
have the knowledge that for a certain parser state, only a small set of 
patterns may match, which in turn would allow much more compact and faster 
lexer branches for those states to be generated.

Best regards,
Christian Schoenebeck





Re: can i rewrite yyparse() based on ACTION and GOTO tables extracted from XML output?

2021-11-06 Thread Akim Demaille
Hi Levin,

> Le 28 oct. 2021 à 08:24, elite liu  a écrit :
> 
> Dear Akim,
> 
>I am writing a utility that convert a table-driven dfa into
> condition/if-else form, for example:
> [...]

I have always wondered if code won't be more efficient than tables.
Since compiler and cpu have learned a lot of tricks to make code
faster, I wouldn't be surprised to have some speedup.


> If above codes are writen in tabled-driven implementation, it should be
> more concise.

Sure.  And compactness is also cache friendliness.  So I agree
it is very much unclear which approach will beat the other one.
On today's machines.

Cheers.


Re: can i rewrite yyparse() based on ACTION and GOTO tables extracted from XML output?

2021-10-28 Thread elite liu
Dear Akim,

I am writing a utility that convert a table-driven dfa into
condition/if-else form, for example:

/***/
enum State {
S0,
S1,
S2,
S3,
S4,
S5,
S6,
ERR
};
State state_trans(State s, char input) {
State ret = ERR;
switch(s) {
case S0: {
if (input == 'a') {
ret = S1;
}
else if (input == 'b') {
ret = S2;
}
else if (input == 'c') {
ret = S6;
}
else if (input == 'd') {
;
}
break;
}
case S1: {
if (input == 'a') {
ret = S6;
}
else if (input == 'b') {
ret = S5;
}
else if (input == 'c') {
ret = S4;
}
else if (input == 'd') {
ret = S1;
}
break;
}
case S2: {
if (input == 'a') {
ret = S3;
}
else if (input == 'b') {
ret = S4;
}
else if (input == 'c') {
ret = S5;
}
else if (input == 'd') {
ret = S4;
}
break;
}
case S3: {
if (input == 'a') {
ret = S4;
}
else if (input == 'b') {
ret = S3;
}
else if (input == 'c') {
ret = S5;
}
else if (input == 'd') {
ret = S1;
}
break;
}
case S4: {
if (input == 'a') {
ret = S2;
}
else if (input == 'b') {
ret = S4;
}
else if (input == 'c') {
ret = S5;
}
else if (input == 'd') {
ret = S4;
}
break;
}
case S5: {
if (input == 'a') {
ret = S0;
}
else if (input == 'b') {
ret = S2;
}
else if (input == 'c') {
ret = S4;
}
else if (input == 'd') {
ret = S6;
}
break;
}
case S6: {
if (input == 'a') {
ret = S1;
}
else if (input == 'b') {
ret = S3;
}
else if (input == 'c') {
ret = S5;
}
else if (input == 'd') {
ret = S4;
}
break;
}
case ERR:
break;
}
return ret;
}
/***/

If above codes are writen in tabled-driven implementation, it should be
more concise.
I convert table-driven implementation into state and lookahead judgement
form, because i want to capture FSM coverage.
A simple walkaround is making use of AFL (an edge coverage monitor tool
which based on branch instrumentation):

afl-clang-fast -c test.cpp [+] Instrumented 57 locations (non-hardened
mode, ratio 100%).

For example, it's difficult to instrument mysql DBMS because of 4.5 Million
of source codes. It will run several days to cover maximum edge coverage.
However, it's quite easy to capture the parse state FSM coverage based on
this approach.

I will implement it after three or four work days. Regards,
Levin

Akim Demaille  于2021年10月20日周三 下午12:01写道:

> Hi Levin,
>
> > Le 14 oct. 2021 à 12:30, elite liu  a écrit :
> >
> > Hi Guys:
> >
> > Because of customization necessaries, i want to rewrite yyparse() based
> on
> > ACTION and GOTO tables -- a more straightforward way, just like the
> classic
> > compiler book does, and ignore error recovery.
> > bison is more complex due to efficient consideration.
>
> Could you be more specific about what you'd like to change?  Also,
> what's the problem with error recovery?  It should not come in your
> way, it's fairly independent.
>
>
> > I notice  bison can dump XML-format output. In XML output file, symbols
> > both terminals and nonterminals, rules are record.  shifts, reduce , and
> > goto actions, conflicts are fully listed.
> > But i don't know whether the information are complete enough to create a
> > big ACTION table and GOTO table based on xml extraction, where conflicts
> > are solved like precedence,  %nonassoc, and %right and %prec can be well
> > treated from the table guided actions.  I try to store ACTION table and
> > GOTO table using std::map. Efficiency is not a focus.
>
> I very much doubt that it's enough to build a parser, and I have no
> intention
> to make it complete enough to this end.  The point of xml output is only
> helping the (human) user to understand the parser.  It's just a structured
> y.output.
>
> However Bison features "skeletons": the backend are written in m4, failry
> independent from the bison executable.  Have a look at
> data/skeletons/yacc.c
> for instance.
>
> Cheers.


Re: can i rewrite yyparse() based on ACTION and GOTO tables extracted from XML output?

2021-10-19 Thread Akim Demaille
Hi Levin,

> Le 14 oct. 2021 à 12:30, elite liu  a écrit :
> 
> Hi Guys:
> 
> Because of customization necessaries, i want to rewrite yyparse() based on
> ACTION and GOTO tables -- a more straightforward way, just like the classic
> compiler book does, and ignore error recovery.
> bison is more complex due to efficient consideration.

Could you be more specific about what you'd like to change?  Also,
what's the problem with error recovery?  It should not come in your
way, it's fairly independent.


> I notice  bison can dump XML-format output. In XML output file, symbols
> both terminals and nonterminals, rules are record.  shifts, reduce , and
> goto actions, conflicts are fully listed.
> But i don't know whether the information are complete enough to create a
> big ACTION table and GOTO table based on xml extraction, where conflicts
> are solved like precedence,  %nonassoc, and %right and %prec can be well
> treated from the table guided actions.  I try to store ACTION table and
> GOTO table using std::map. Efficiency is not a focus.

I very much doubt that it's enough to build a parser, and I have no intention
to make it complete enough to this end.  The point of xml output is only
helping the (human) user to understand the parser.  It's just a structured
y.output.

However Bison features "skeletons": the backend are written in m4, failry
independent from the bison executable.  Have a look at data/skeletons/yacc.c
for instance.

Cheers.