> I've been trying to nut out a problem for a week or so in a bison parser
> I've been writing, and I'm failing. Please excuse me if this isn't the
> correct place to ask - I haven't convened with anyone about bison parsers
> before.

It is one correct place to ask. If it's more about compilers in general,
the Usenet newsgroup `comp.compilers' would be another good place to ask.

> I'm writing a parser for Ruby (http://www.ruby-lang.org/). So far, it can
> parse a sizeable subset of the language. The problem I'm encountering is
> as follows:
>
> `obj.method' is a method call on `obj'. This is parsed correctly.
> `obj.method(args)' is a method call with arguments given. Also fine.
> `obj.method(args).another_method' is a method call to `obj' with some
> arguments, then a method call to the return value of that method. This is
> parsed incorrectly.
>
> A shift/reduce conflict occurs where `obj.method(args)' is seen by my
> parser. Instead of reducing when it sees the upcoming `.', it shifts, and
> it ends up with the whole `obj.method(args).another_method' on the stack.

It may not be what you want, but it looks like correct behavior to me,
according to your description. However, it's not clear to me how `yyparse'
is supposed to know that `obj.method(args)' is an object that can be used
in a call to a member function. I don't know Ruby, but couldn't the return
value be any type? This may be a case where you would have to extract this
information in an action and make it available to the parser somehow,
perhaps by "faking" a token (as I've described on this list ad nauseam).

Something that might help you with precedence is the idea of using a
hierarchy of expressions, as Donald Knuth did in his METAFONT language. He
didn't use Bison (or yacc), but I've used this idea for the parser for GNU
3DLDF, if you want an example.

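As a rough sketch of how such a hierarchy could be applied to the
method-call case: the grammar below is not taken from 3DLDF or from any
real Ruby grammar, every rule and token name in it is made up for
illustration, and it assumes that the result of any call may be the
receiver of another call. The scanner is deliberately minimal.

```yacc
/* Sketch only: all rule and token names are hypothetical. */
%{
#include <ctype.h>
#include <stdio.h>
int yylex(void);
void yyerror(const char *msg);
%}

%token IDENTIFIER

%%

input:
    expression '\n'             { puts("ok"); }
  ;

/* Loosest level: '+' and '-'. */
expression:
    secondary
  | expression '+' secondary
  | expression '-' secondary
  ;

/* Next level: '*' and '/' bind tighter than '+' and '-'. */
secondary:
    primary
  | secondary '*' primary
  | secondary '/' primary
  ;

/* Tightest level: a method call is itself a primary, so the result
   of one call can be the receiver of the next one. */
primary:
    IDENTIFIER
  | '(' expression ')'
  | primary '.' IDENTIFIER
  | primary '.' IDENTIFIER '(' arguments ')'
  ;

arguments:
    /* empty */
  | argument_list
  ;

argument_list:
    expression
  | argument_list ',' expression
  ;

%%

/* Minimal scanner: identifiers become IDENTIFIER, blanks are skipped,
   and every other character is returned as itself. */
int yylex(void)
{
    int c = getchar();
    while (c == ' ' || c == '\t')
        c = getchar();
    if (c == EOF)
        return 0;
    if (isalpha(c) || c == '_') {
        while (isalnum(c) || c == '_')
            c = getchar();
        ungetc(c, stdin);
        return IDENTIFIER;
    }
    return c;
}

void yyerror(const char *msg) { fprintf(stderr, "%s\n", msg); }

int main(void) { return yyparse(); }
```

With rules shaped like this, once `obj.method(args)' is complete the
parser can only reduce it to `primary', and the following `.' is then
shifted with that `primary' as the receiver, so the call chain itself
should not need any precedence declarations. Whether this fits the rest
of a Ruby grammar is another question.
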
Here's an extract from 3DLDF's `parser.output' to give you the idea:

  1238 numeric_single: LEFT_PARENTHESIS numeric_expression RIGHT_PARENTHESIS

  1110 numeric_atom: numeric_variable
  1111             | numeric_token_atom
  1112             | numeric_single

  1113 numeric_token_atom: numeric_token OVER numeric_token
  1114                   | numeric_token

  1115 numeric_primary: numeric_atom
  1116                | numeric_atom LEFT_BRACKET numeric_expression \
                          COMMA numeric_expression RIGHT_BRACKET
  1117                | MAGNITUDE numeric_primary
  1118                | LENGTH numeric_primary

  [...]

  1209 numeric_secondary: numeric_primary
  1210                  | numeric_secondary TIMES numeric_variable
  1211                  | numeric_secondary numeric_variable
  1212                  | numeric_token OVER numeric_variable
  1213                  | numeric_secondary times_or_over numeric_primary
  1214                  | point_secondary DOT_PRODUCT point_primary
  1215                  | point_secondary ANGLE point_primary

  [...]

  1229 numeric_tertiary: numeric_secondary
  1230                 | numeric_tertiary PLUS numeric_secondary
  1231                 | numeric_tertiary MINUS numeric_secondary
  1232                 | numeric_tertiary pythagorean_plus_or_minus \
                           numeric_secondary

  1235 numeric_expression: numeric_tertiary

Most types don't have "singles", "atoms", or "tokens"; they just have
"primaries", "secondaries", "tertiaries" and "expressions". "numerics"
need special handling. I chose the numeric type for this example because
the arithmetic operations apply to it, which should make the principle
clearer.

The point is that the precedence is implicit in the rules: an operator
introduced at the "secondary" level binds more tightly than one introduced
at the "tertiary" level. It is (usually) unnecessary to declare the
precedence of operators, and it's possible to make fine adjustments with
respect to the precedence of the rules.

It's a bit more complicated than this, but I won't go into the details
unless you need them. Or you could just have a look through
`parser.output':

http://cvs.savannah.gnu.org/viewvc/3dldf/3dldf/Group/CWEB/parser.output?view=log

Laurence Finston

_______________________________________________
help-bison@gnu.org http://lists.gnu.org/mailman/listinfo/help-bison