Re: Problem with processing more than one statement

comsatcat Fri, 13 Feb 2009 09:17:36 -0800

Dennis,

Thank you for all the feedback.  I RTFM'ed a lot and have a better
understanding of how ply works.


I am running into one more problem: I'm having trouble figuring out how to
handle loops. The only thing I can think of is building a stack of the lines
of code within the loop, then running that through a new instance of yacc for
each iteration of the loop.

Could you or someone else please elaborate on how I would handle looping
semantics in ply?

Thanks,
Ben
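
For reference, one common way to handle loops (a sketch only, not something PLY prescribes) is to have the grammar actions build a tree instead of executing code while parsing, and to run the tree afterwards. The loop body is then parsed once and walked as many times as needed, so no second yacc instance is required. The tuple layouts ('assign', 'while', 'binop', ...) and the names evaluate and execute are made up for illustration; a grammar action would build such nodes with something like p[0] = ('assign', name, expr).

```python
# Sketch: interpret a small syntax tree instead of executing during parsing.
# The node layouts below are hypothetical; grammar actions would create them.
import operator

BINARY_OPS = {
    '+': operator.add, '-': operator.sub,
    '*': operator.mul, '/': operator.truediv,
    '<': operator.lt, '>': operator.gt, '==': operator.eq,
    '<=': operator.le, '>=': operator.ge,
}

def evaluate(node, env):
    """Return the value of an expression node."""
    kind = node[0]
    if kind == 'num':
        return node[1]
    if kind == 'var':
        return env[node[1]]          # KeyError if the variable is undefined
    if kind == 'binop':
        _, op, left, right = node
        return BINARY_OPS[op](evaluate(left, env), evaluate(right, env))
    raise ValueError("unknown expression node: %r" % (kind,))

def execute(statements, env):
    """Run a list of statement nodes, updating env in place."""
    for node in statements:
        kind = node[0]
        if kind == 'assign':
            _, name, expr = node
            env[name] = evaluate(expr, env)
        elif kind == 'while':
            _, cond, body = node
            # The body was parsed once; it is simply walked again each pass.
            while evaluate(cond, env):
                execute(body, env)
        else:
            raise ValueError("unknown statement node: %r" % (kind,))
    return env

# Hand-built tree for a loop that sums 0, 1, 2 into 'total'
# (what a parser might produce for a hypothetical while statement).
program = [
    ('assign', 'i', ('num', 0)),
    ('assign', 'total', ('num', 0)),
    ('while', ('binop', '<', ('var', 'i'), ('num', 3)), [
        ('assign', 'total', ('binop', '+', ('var', 'total'), ('var', 'i'))),
        ('assign', 'i', ('binop', '+', ('var', 'i'), ('num', 1))),
    ]),
]

variables = execute(program, {})
print(variables)
```

The same split also helps with if/else: the branch that is not taken is still parsed, but never executed.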

On Thu, Feb 12, 2009 at 3:44 AM, D.Hendriks (Dennis) <[email protected]> wrote:

> Hello Ben,
>
> If you don't specify the start symbol, the symbol name of the first rule
> (in the file) is used. In the original case, it was 'statement', because the
> first rule in the file is 'statement : VARIABLE EQUALS expression SEMI'. At
> least that is what I think happens (if I remember correctly from reading the
> PLY documentation). I always explicitly supply the start symbol to the yacc
> method. See the PLY documentation for more information. If you take a look
> at the generated parser.out file, you see:
>
> Grammar
>
> Rule 1     S' -> statement
> Rule 2     statement -> VARIABLE EQUALS expression SEMI
> Rule 3     statement -> expression
> Rule 4     expression -> expression PLUS expression
> Rule 5     expression -> expression MINUS expression
> Rule 6     expression -> expression TIMES expression
> Rule 7     expression -> expression DIVIDE expression
> Rule 8     expression -> NUMBER
> Rule 9     expression -> VARIABLE
>
> meaning that 'statement' is indeed the start symbol, as the rule for the
> augmented start symbol S' shows.
>
> PLY generates parsing tables. When you supply input to the parse method,
> PLY will try to parse it. It will try to 'match' it to the start symbol. So,
> it will try to match a 'statement'. If you want to see what happens
> internally, supply debug=2 as parameter to the parse method. Also, you may
> want to check out the parser.out file to see the parse tables information in
> human readable format. If you look at the debug output (with debug=2), you
> get (I use PLY 3.0, but you get something similar in PLY 2.5):
>
> PLY: PARSE DEBUG START
>
> State  : 0
> Stack  : . LexToken(VARIABLE,'$foo',1,0)
> Action : Shift and goto state 3
>
> State  : 3
> Stack  : VARIABLE . LexToken(EQUALS,'=',1,5)
> Action : Shift and goto state 5
>
> State  : 5
> Stack  : VARIABLE EQUALS . LexToken(NUMBER,1,1,7)
> Action : Shift and goto state 1
>
> State  : 1
> Stack  : VARIABLE EQUALS NUMBER . LexToken(SEMI,';',1,8)
> Action : Reduce rule [expression -> NUMBER] with [1] and goto state 7
> expression_number
> Result : <int @ 0x813ef18> (1)
>
> State  : 11
> Stack  : VARIABLE EQUALS expression . LexToken(SEMI,';',1,8)
> Action : Shift and goto state 16
>
> State  : 16
> Stack  : VARIABLE EQUALS expression SEMI . LexToken(VARIABLE,'$bar',2,10)
> ERROR: Error  : VARIABLE EQUALS expression SEMI . LexToken(VARIABLE,'$bar',2,10)
> Syntax error on line 2: $bar
>
> You see it is in state 16, which you can look up in parser.out:
>
> state 16
>
>     (1) statement -> VARIABLE EQUALS expression SEMI .
>
>     $end            reduce using rule 1 (statement -> VARIABLE EQUALS expression SEMI .)
>
> You see it only accepts $end (end of input) in that state, meaning it
> didn't expect more input after the first ';' character, which is exactly why
> you need the 'statements' symbol and corresponding parsing rules.
>
> For more information, consult documentation on ply, original yacc and/or
> LALR(1) parsers.
>
> Hope this helps,
> Dennis
>
>
>
>
> comsatcat wrote:
>
> Dennis,
>
> Thank you for the feedback... I've no real experience using lex/yacc so
> this is a new experience for me :)
>
> I got it working by adding the two definitions you suggested, but I'm not sure
> I fully understand what's going on, so I was hoping you could elaborate.
>
> From what I get from looking at the code...
>
> It defaults to the first processing statement encountered.
> By adding the two definitions you mentioned below, I'm essentially saying
> "the data can have multiple statements and statements of statements".  What
> I'm not understanding is the path the parser takes after matching the
> statement.
>
> If I am following correctly it should look like this (from top to bottom):
>
> statement = dereference(statements)
> expression = dereference(statement)
> variable/number = dereference(expression)
>
> I understand it's much more complex than that, but in a nutshell, is that the
> general path it's following?
>
> Thanks in advance,
> Ben
>
>
>
> On Thu, Feb 12, 2009 at 1:27 AM, D.Hendriks (Dennis) <[email protected]> wrote:
>
>>
>> Hello comsatcat,
>>
>> You defined parser rules but did not specify the start symbol, meaning
>> it gets to be 'statement', because that's the first one (I think).
>> Statement matches the part '$foo = 1;' after which the statement has
>> been matched. The grammar doesn't expect anything after that. You could
>> add something like this, before any other parsing rules:
>>
>> def p_statements_1(p):
>>    '''
>>    statements : statement
>>    '''
>>    pass # or something else...
>>
>>
>> def p_statements_2(p):
>>    '''
>>    statements : statements statement
>>    '''
>>    pass # or something else...
>>
>>
>> Or you could add it after/between the other rules, and explicitly define
>> the start symbol.
>>
>> Dennis
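
As a possible illustration of the "or something else" in the two rules above, the actions could collect each statement's result in a list. This is only a sketch: the list-building is an assumption about what one might want, and the plain lists at the bottom merely stand in for the production object that PLY normally passes to an action, so the snippet runs without a parser.

```python
# Sketch only: actions for the two 'statements' rules that collect results
# in a list instead of using 'pass'. The list-building is an assumption,
# not part of the original advice.

def p_statements_1(p):
    '''
    statements : statement
    '''
    p[0] = [p[1]]            # the first statement starts the list

def p_statements_2(p):
    '''
    statements : statements statement
    '''
    p[0] = p[1] + [p[2]]     # each further statement is appended

# PLY normally calls these with its own production object. Plain lists
# stand in for it here, purely to show what the actions compute.
first = [None, 'stmt-A']
p_statements_1(first)
second = [None, first[0], 'stmt-B']
p_statements_2(second)
print(second[0])
```

If these rules are not the first ones in the file, the start symbol can be named explicitly with yacc.yacc(start='statements').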
>>
>>
>> comsatcat wrote:
>> > I'm playing with Ply, my input file is as follows:
>> >
>> > $foo = 1;
>> > $bar = 3;
>> >
>> > My problem is, when I pass multiple lines to the parser, it errors out
>> > after the first statement at $bar...
>> >
>> > <code>
>> >
>> > #!/usr/bin/env python
>> >
>> > import os, sys
>> > import ply.lex as lex
>> > import ply.yacc as yacc
>> >
>> > tables = {}
>> >
>> > reserved = {
>> >             'if' : 'IF',
>> >             'else' : 'ELSE',
>> >             'elsif' : 'ELSIF',
>> >             'while' : 'WHILE',
>> >             'for' : 'FOR',
>> >             'print' : 'PRINT'
>> > }
>> >
>> > tokens = [
>> >           'VARIABLE', 'NUMBER',
>> >           'PLUS', 'MINUS', 'TIMES', 'DIVIDE', 'EQUALS',
>> >           'LPAREN', 'RPAREN', 'SEMI', 'COMMENT',
>> >           'LT', 'GT', 'ET', 'LE', 'GE'
>> > ]
>> > tokens += list(reserved.values())
>> > t_LT = r'<'
>> > t_GT = r'>'
>> > t_ET = r'=='
>> > t_LE = r'<='
>> > t_GE = r'>='
>> > t_SEMI = r';'
>> > t_PLUS = r'\+'
>> > t_MINUS = r'-'
>> > t_TIMES = r'\*'
>> > t_DIVIDE = r'/'
>> > t_EQUALS = r'='
>> > t_LPAREN = r'\('
>> > t_RPAREN = r'\)'
>> >
>> > t_ignore = ' \t'
>> >
>> > def t_COMMENT(t):
>> >     r'\#.*'
>> >     pass
>> >
>> > def t_VARIABLE(t):
>> >     r'\$[a-zA-Z]{1,}'
>> >     t.type = reserved.get(t.value, 'VARIABLE')
>> >     return t
>> >
>> > def t_NUMBER(t):
>> >     r'[0-9]+'
>> >     t.value = int(t.value)
>> >     return t
>> >
>> > def t_newline(t):
>> >     r'\n'
>> >     t.lexer.lineno += 1
>> >
>> > def t_error(t):
>> >     print "Illegal character '%s'" % t.value[0]
>> >     t.lexer.skip(1)
>> >
>> > lexer = lex.lex()
>> >
>> > precedence = (
>> >               ('left', 'PLUS','MINUS'),
>> >               ('left', 'TIMES','DIVIDE')
>> >             )
>> >
>> > def p_statement_assignment(p):
>> >     '''
>> >     statement : VARIABLE EQUALS expression SEMI
>> >     '''
>> >     tables[p[1].replace("$", "")] = p[3]
>> >     print "statement_assignment"
>> >
>> > def p_statement_expression(p):
>> >     '''
>> >     statement : expression
>> >     '''
>> >     print "statement_expression"
>> >     p[0] = p[1]
>> >
>> > def p_expression(p):
>> >     '''
>> >     expression : expression PLUS expression
>> >                | expression MINUS expression
>> >                | expression TIMES expression
>> >                | expression DIVIDE expression
>> >     '''
>> >     print "expression"
>> >     if p[2] == '+':
>> >         p[0] = p[1] + p[3]
>> >     elif p[2] == '-':
>> >         p[0] = p[1] - p[3]
>> >     elif p[2] == '*':
>> >         p[0] = p[1] * p[3]
>> >     elif p[2] == '/':
>> >         p[0] = p[1] / p[3]
>> >
>> > def p_expression_number(p):
>> >     '''
>> >     expression : NUMBER
>> >     '''
>> >     print "expression_number"
>> >     p[0] = p[1]
>> >
>> > def p_expression_variable(p):
>> >     '''
>> >     expression : VARIABLE
>> >     '''
>> >     print "expression_variable"
>> >     try:
>> >         p[0] = tables[p[1].replace("$", "")]
>> >     except KeyError:  # a missing dict key raises KeyError, not IndexError
>> >         print "Cannot find variable"
>> >         pass
>> >
>> > def p_error(p):
>> >     if not p:
>> >         print "Syntax error: premature end of file"
>> >     else:
>> >         print "Syntax error on line %d: %s" % (p.lineno, p.value)
>> >
>> > parser = yacc.yacc()
>> >
>> > def run():
>> >     data = open(sys.argv[1]).read()
>> >     parser.parse(data)
>> >
>> > if __name__ == "__main__":
>> >     if len(sys.argv) != 2:
>> >         sys.exit(0)
>> >
>> >     run()
>> >     print str(tables)
>> >
>> > </code>
>> >
>> > Does anyone see what I'm doing wrong here?

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"ply-hack" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to 
[email protected]
For more options, visit this group at 
http://groups.google.com/group/ply-hack?hl=en
-~----------~----~----~----~------~----~------~--~---
