Re: Inserting extra tokens

Hans Aberg Fri, 18 Aug 2006 00:24:23 -0700

On 17 Aug 2006, at 08:51, Erik Sandberg wrote:

2. When a function argument has been shifted, the parser artificially inserts
a special token as the next token. The token is a COMMA or SEMICOLON
depending on whether the shifted argument was the last argument.

...

A problem in (2) is _how_ to insert the extra token correctly. I first tried to put flex in a state where it sends the desired token without reading anything. This was however not sufficient: In some cases, the parser has already read the next token for the lookahead when an argument is shifted.

There is, in general, no good way to insert a token, as the LALR(1) algorithm that Bison uses to create the parser may or may not need a lookahead token in each parsing position (i.e., a set of rules, each with a dot in it, as in the states of the .output file that Bison can write). Thus, one does not know which tokens have already been read when the parser is in a particular state and parsing position.

I am working with the parser of GNU LilyPond, and I want to improve the way function invocations are parsed in the language. The language uses a syntax for functions which is somewhat similar to LaTeX's syntax for macros. E.g., if \foo is declared as a binary function, then \foo a b calls the function with parameters a and b.
I'm looking for a way to express this syntax in bison, in a generic way
(currently only a limited number of function arities are possible).

If one is only implementing Prolog or Haskell style operator precedences, there are two methods in use:

If the number of precedences is small, as in Haskell, which only has about ten, one can list all the parsing possibilities in the .y file, and then type the tokens accordingly. If the number is large, as in Prolog, then one gives all the operator tokens one type, like "operator", and lets the .y rule action put the operators and operands of the expression onto a stack. Then, after the expression has been parsed, one lets a C/C++ function sort out the expression using the operator precedences.
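As a rough illustration of the second method, here is a minimal sketch in C of a function that sorts out a flat operand/operator list after parsing, using precedence climbing. The names Item, prec_of, apply and resolve are made up for this example, the precedence table is hard-coded (in a Prolog-like language it would be run-time data), all operators are assumed to be binary and left-associative, and the expression is evaluated directly rather than built into a tree, just to keep it short.

```c
#include <assert.h>

/* Hypothetical flat item, as a grammar action might push it. */
typedef struct {
    int is_op;   /* 1 = operator, 0 = operand */
    int value;   /* operand value, or the operator character */
} Item;

/* Assumed precedence table; 0 means "not a known operator". */
static int prec_of(int op)
{
    switch (op) {
    case '*': case '/': return 2;
    case '+': case '-': return 1;
    default:            return 0;
    }
}

static int apply(int op, int a, int b)
{
    switch (op) {
    case '+': return a + b;
    case '-': return a - b;
    case '*': return a * b;
    default:  assert(op == '/' && b != 0); return a / b;
    }
}

/* Precedence climbing over items[*pos .. n-1].  Consumes one operand,
   then every following operator whose precedence is at least min_prec.
   Binary, left-associative operators only. */
static int resolve(const Item *items, int n, int *pos, int min_prec)
{
    assert(*pos < n && !items[*pos].is_op);
    int lhs = items[(*pos)++].value;
    while (*pos < n && items[*pos].is_op
           && prec_of(items[*pos].value) >= min_prec) {
        int op = items[(*pos)++].value;
        /* The right operand binds tighter than op itself. */
        int rhs = resolve(items, n, pos, prec_of(op) + 1);
        lhs = apply(op, lhs, rhs);
    }
    return lhs;
}
```

The grammar side then only needs a rule that accepts any alternation of operands and "operator" tokens and appends each one to the array. A call such as resolve(items, n, &pos, 1) with pos starting at 0 does the grouping afterwards.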

3. The argument list uses a grammar similar to
arglist: argument COMMA arglist | argument SEMICOLON ;
(an argument can be a complex expression)

Now, I do not see exactly how this precedence problem relates to yours, as you have a different syntax. But if you only admit a limited number of arities, you could list them all in the .y grammar. Otherwise, you will have to use the other method indicated above: create a dynamic argument list object, and then use the arity of the function token to work it out after the rule has been parsed.
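To make the last suggestion a bit more concrete, here is a minimal sketch in C of the post-processing step, assuming the grammar has already collected the function names and arguments into a flat array. The names arity_of, append and group and the arity table are invented for this example, and each argument is a single token here, whereas in a real grammar it could be an already-parsed expression.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical arity table; a real parser would fill this from the
   function declarations it has seen. */
static int arity_of(const char *name)
{
    if (strcmp(name, "\\foo") == 0) return 2;
    if (strcmp(name, "\\bar") == 0) return 1;
    return -1;  /* not a function: a plain argument */
}

/* Append s to the NUL-terminated string in out (capacity cap),
   truncating if it does not fit. */
static void append(char *out, size_t cap, const char *s)
{
    size_t len = strlen(out);
    if (len + 1 < cap)
        snprintf(out + len, cap - len, "%s", s);
}

/* Consume one complete argument starting at toks[*pos] and append its
   grouped form to out.  A function name consumes as many following
   arguments as its arity says.  Returns 0 on success, -1 if the list
   ends before all arguments have been found. */
static int group(const char **toks, int n, int *pos, char *out, size_t cap)
{
    if (*pos >= n)
        return -1;
    const char *t = toks[(*pos)++];
    int arity = arity_of(t);
    append(out, cap, t);
    if (arity < 0)
        return 0;
    append(out, cap, "(");
    for (int i = 0; i < arity; i++) {
        if (i > 0)
            append(out, cap, ", ");
        if (group(toks, n, pos, out, cap) != 0)
            return -1;
    }
    append(out, cap, ")");
    return 0;
}
```

With the list \foo a \bar b and an empty output buffer, group produces \foo(a, \bar(b)), and a list that stops short of the declared arity is reported as an error. No inserted separator tokens are needed, since the arity alone decides where each argument list ends.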


  Hans Aberg




_______________________________________________
help-bison@gnu.org http://lists.gnu.org/mailman/listinfo/help-bison
