Dennis,

Thank you for all the feedback. I RTFM'ed a lot and have a better understanding of how PLY works.

I am running into one more problem: I'm having trouble figuring out how to handle loops. The only approach I can think of is to build a stack of the lines of code inside the loop, then run that through a new instance of yacc for each iteration of the loop. Could you or someone else please elaborate on how I would handle looping semantics in PLY?

Thanks,
Ben

On Thu, Feb 12, 2009 at 3:44 AM, D.Hendriks (Dennis) <[email protected]> wrote:

> Hello Ben,
>
> If you don't specify the start symbol, the symbol name of the first rule
> (in the file) is used. In the original case, it was 'statement', because the
> first rule in the file is 'statement : VARIABLE EQUALS expression SEMI'. At
> least that is what I think happens (if I remember correctly from reading the
> PLY documentation). I always explicitly supply the start symbol to the yacc
> method. See the PLY documentation for more information. If you take a look
> at the generated parser.out file, you see:
>
> Grammar
>
> Rule 1     S' -> statement
> Rule 2     statement -> VARIABLE EQUALS expression SEMI
> Rule 3     statement -> expression
> Rule 4     expression -> expression PLUS expression
> Rule 5     expression -> expression MINUS expression
> Rule 6     expression -> expression TIMES expression
> Rule 7     expression -> expression DIVIDE expression
> Rule 8     expression -> NUMBER
> Rule 9     expression -> VARIABLE
>
> meaning that indeed 'statement' is the start symbol, looking at the actual
> start symbol S'.
>
> PLY generates parsing tables. When you supply input to the parse method,
> PLY will try to parse it. It will try to 'match' it to the start symbol. So,
> it will try to match a 'statement'. If you want to see what happens
> internally, supply debug=2 as parameter to the parse method. Also, you may
> want to check out the parser.out file to see the parse tables information in
> human readable format.
> If you look at the debug output (with debug=2), you
> get (I use PLY 3.0, but you get something similar in PLY 2.5):
>
> PLY: PARSE DEBUG START
>
> State  : 0
> Stack  : . LexToken(VARIABLE,'$foo',1,0)
> Action : Shift and goto state 3
>
> State  : 3
> Stack  : VARIABLE . LexToken(EQUALS,'=',1,5)
> Action : Shift and goto state 5
>
> State  : 5
> Stack  : VARIABLE EQUALS . LexToken(NUMBER,1,1,7)
> Action : Shift and goto state 1
>
> State  : 1
> Stack  : VARIABLE EQUALS NUMBER . LexToken(SEMI,';',1,8)
> Action : Reduce rule [expression -> NUMBER] with [1] and goto state 7
> expression_number
> Result : <int @ 0x813ef18> (1)
>
> State  : 11
> Stack  : VARIABLE EQUALS expression . LexToken(SEMI,';',1,8)
> Action : Shift and goto state 16
>
> State  : 16
> Stack  : VARIABLE EQUALS expression SEMI . LexToken(VARIABLE,'$bar',2,10)
> ERROR: Error  : VARIABLE EQUALS expression SEMI . LexToken(VARIABLE,'$bar',2,10)
> Syntax error on line 2: $bar
>
> You see it is in state 16, which you can look up in parser.out:
>
> state 16
>
>     (1) statement -> VARIABLE EQUALS expression SEMI .
>
>     $end            reduce using rule 1 (statement -> VARIABLE EQUALS expression SEMI .)
>
> You see it only accepts $end (end of input) in that state, meaning it
> didn't expect more input after the first ';' character, which is exactly why
> you need the 'statements' symbol and corresponding parsing rules.
>
> For more information, consult documentation on ply, original yacc and/or
> LALR(1) parsers.
>
> Hope this helps,
> Dennis
>
> comsatcat wrote:
>
> Dennis,
>
> Thank you for the feedback... I've no real experience using lex/yacc so
> this is a new experience for me :)
>
> I got it working adding the two definitions you suggested, but I'm not sure
> I understand what's totally going on, so I was hoping you could elaborate.
>
> From what I get from looking at the code...
>
> It defaults to the first processing statement encountered.
> By adding the two definitions you mentioned below, I'm essentially saying
> "the data can have multiple statements and statements of statements". What
> I'm not understanding is the path the parser takes after matching the
> statement.
>
> If I am following correctly it should look like this (from top to bottom):
>
> statement = dereference(statements)
> expression = dereference(statement)
> variable/number = dereference(expression)
>
> I understand it's much more complex than that, but in a nutshell that's the
> general path it's following?
>
> Thanks in advance,
> Ben
>
> On Thu, Feb 12, 2009 at 1:27 AM, D.Hendriks (Dennis) <[email protected]> wrote:
>
>> Hello comsatcat,
>>
>> You defined parser rules but did not specify the start symbol, meaning
>> it gets to be 'statement', because that's the first one (I think).
>> Statement matches the part '$foo = 1;' after which the statement has
>> been matched. The grammar doesn't expect anything after that. You could
>> add something like this, before any other parsing rules:
>>
>> def p_statements_1(p):
>>     '''
>>     statements : statement
>>     '''
>>     pass # or something else...
>>
>>
>> def p_statements_2(p):
>>     '''
>>     statements : statements statement
>>     '''
>>     pass # or something else...
>>
>>
>> Or you could add it after/between the other rules, and explicitly define
>> the start symbol.
>>
>> Dennis
>>
>>
>> comsatcat wrote:
>> > I'm playing with Ply, my input file is as follows:
>> >
>> > $foo = 1;
>> > $bar = 3;
>> >
>> > My problem is, when I pass multiple lines to the parser, it errors out
>> > after the first statement at $bar...
>> >
>> > <code>
>> >
>> > #!/usr/bin/env python
>> >
>> > import os, sys
>> > import ply.lex as lex
>> > import ply.yacc as yacc
>> >
>> > tables = {}
>> >
>> > reserved = {
>> >     'if' : 'IF',
>> >     'else' : 'ELSE',
>> >     'elsif' : 'ELSIF',
>> >     'while' : 'WHILE',
>> >     'for' : 'FOR',
>> >     'print' : 'PRINT'
>> > }
>> >
>> > tokens = [
>> >     'VARIABLE', 'NUMBER',
>> >     'PLUS', 'MINUS', 'TIMES', 'DIVIDE', 'EQUALS',
>> >     'LPAREN', 'RPAREN', 'SEMI', 'COMMENT',
>> >     'LT', 'GT', 'ET', 'LE', 'GE'
>> > ]
>> > tokens += list(reserved.values())
>> > t_LT = r'<'
>> > t_GT = r'>'
>> > t_ET = r'=='
>> > t_LE = r'<='
>> > t_GE = r'>='
>> > t_SEMI = r';'
>> > t_PLUS = r'\+'
>> > t_MINUS = r'-'
>> > t_TIMES = r'\*'
>> > t_DIVIDE = r'/'
>> > t_EQUALS = r'='
>> > t_LPAREN = r'\('
>> > t_RPAREN = r'\)'
>> >
>> > t_ignore = ' \t'
>> >
>> > def t_COMMENT(t):
>> >     r'\#.*'
>> >     pass
>> >
>> > def t_VARIABLE(t):
>> >     r'\$[a-zA-Z]{1,}'
>> >     t.type = reserved.get(t.value, 'VARIABLE')
>> >     return t
>> >
>> > def t_NUMBER(t):
>> >     r'[0-9]+'
>> >     t.value = int(t.value)
>> >     return t
>> >
>> > def t_newline(t):
>> >     r'\n'
>> >     t.lexer.lineno += 1
>> >
>> > def t_error(t):
>> >     print "Illegal character '%s'" % t.value[0]
>> >     t.lexer.skip(1)
>> >
>> > lexer = lex.lex()
>> >
>> > precedence = (
>> >     ('left', 'PLUS','MINUS'),
>> >     ('left', 'TIMES','DIVIDE')
>> > )
>> >
>> > def p_statement_assignment(p):
>> >     '''
>> >     statement : VARIABLE EQUALS expression SEMI
>> >     '''
>> >     tables[p[1].replace("$", "")] = p[3]
>> >     print "statement_assignment"
>> >
>> > def p_statement_expression(p):
>> >     '''
>> >     statement : expression
>> >     '''
>> >     print "statement_expression"
>> >     p[0] = p[1]
>> >
>> > def p_expression(p):
>> >     '''
>> >     expression : expression PLUS expression
>> >                | expression MINUS expression
>> >                | expression TIMES expression
>> >                | expression DIVIDE expression
>> >     '''
>> >     print "expression"
>> >     if p[2] == '+':
>> >         p[0] = p[1] + p[3]
>> >     elif p[2] == '-':
>> >         p[0] = p[1] - p[3]
>> >     elif p[2] == '*':
>> >         p[0] = p[1] * p[3]
>> >     elif p[2] == '/':
>> >         p[0] = p[1] / p[3]
>> >
>> > def p_expression_number(p):
>> >     '''
>> >     expression : NUMBER
>> >     '''
>> >     print "expression_number"
>> >     p[0] = p[1]
>> >
>> > def p_expression_variable(p):
>> >     '''
>> >     expression : VARIABLE
>> >     '''
>> >     print "expression_variable"
>> >     try:
>> >         p[0] = tables[p[1].replace("$", "")]
>> >     except KeyError:
>> >         print "Cannot find variable"
>> >         pass
>> >
>> > def p_error(p):
>> >     if not p:
>> >         print "Syntax error: premature end of file"
>> >     else:
>> >         print "Syntax error on line %d: %s" % (p.lineno, p.value)
>> >
>> > parser = yacc.yacc()
>> >
>> > def run():
>> >     data = open(sys.argv[1]).read()
>> >     parser.parse(data)
>> >
>> > if __name__ == "__main__":
>> >     if len(sys.argv) != 2:
>> >         sys.exit(0)
>> >
>> >     run()
>> >     print str(tables)
>> >
>> > </code>
>> >
>> > Does anyone see what I'm doing wrong here?

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "ply-hack" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [email protected]
For more options, visit this group at http://groups.google.com/group/ply-hack?hl=en
-~----------~----~----~----~------~----~------~--~---
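
On the looping question at the top of this message: a commonly used alternative to re-running yacc for every iteration is to have the grammar actions build a tree and then walk that tree, so a loop body is parsed once and executed as many times as needed. The sketch below is plain Python and does not call PLY at all. The tuple node shapes ('num', 'var', 'binop', 'assign', 'block', 'while'), the helper names evaluate and execute, and the while/brace syntax mentioned in the comments are all assumptions made for illustration; none of them come from the grammar in this thread or from the PLY API.

```python
import operator

# Only a subset of operators, enough for the example below.
OPS = {'+': operator.add, '-': operator.sub, '*': operator.mul,
       '<': operator.lt, '>': operator.gt, '==': operator.eq}

# In a PLY grammar, an action would return a node instead of a value, e.g.
# (hypothetical rule name):
#   def p_expression_binop(p):
#       'expression : expression PLUS expression'
#       p[0] = ('binop', p[2], p[1], p[3])

def evaluate(node, env):
    """Compute the value of an expression node."""
    kind = node[0]
    if kind == 'num':
        return node[1]
    if kind == 'var':
        return env[node[1]]
    if kind == 'binop':
        _, op, left, right = node
        return OPS[op](evaluate(left, env), evaluate(right, env))
    raise ValueError("unknown expression node: %r" % (kind,))

def execute(node, env):
    """Run a statement node, updating env in place."""
    kind = node[0]
    if kind == 'assign':
        _, name, expr = node
        env[name] = evaluate(expr, env)
    elif kind == 'block':
        for stmt in node[1]:
            execute(stmt, env)
    elif kind == 'while':
        _, cond, body = node
        # The body subtree is walked again on each pass; nothing is re-parsed.
        while evaluate(cond, env):
            execute(body, env)
    else:
        raise ValueError("unknown statement node: %r" % (kind,))

# Tree a parser might produce for this (hypothetical) input:
#   $i = 0; $sum = 0;
#   while ($i < 5) { $sum = $sum + $i; $i = $i + 1; }
program = ('block', [
    ('assign', 'i', ('num', 0)),
    ('assign', 'sum', ('num', 0)),
    ('while', ('binop', '<', ('var', 'i'), ('num', 5)),
     ('block', [
         ('assign', 'sum', ('binop', '+', ('var', 'sum'), ('var', 'i'))),
         ('assign', 'i', ('binop', '+', ('var', 'i'), ('num', 1))),
     ])),
])

env = {}
execute(program, env)
```

After the call, env holds the final variable values (sum is 10 and i is 5 for this tree). The design point is that the parser runs once and produces data; if, else and for could be handled the same way by adding node kinds to execute.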
