On Wed, Jan 4, 2012 at 11:00 AM, Vincent Zweije <vinc...@zweije.nl> wrote:

> I may have hit a bug in the lemon parser generator.
>

Please see if the following fix clears your problem:

    http://www.sqlite.org/src/info/ce32775b23



>
> It looks like lookahead symbols aren't propagated properly, in some cases.
>
> The consequence is that in some states, valid symbols are rejected with
> a syntax error.
>
> The complete grammar can be found at
>
>    http://www.zweije.nl/~vzweije/parser.y
>
> I'm processing the grammar with lemon version 3.7.7-2ubuntu2. I've
> compared that version to the current head in sqlite, and it only differs
> in non-essential areas (calls to fclose recently added in trunk, calls
> to snprintf added in ubuntu). I'm pretty sure the current head version
> of lemon will exhibit the problem as well.
>
> For completeness' sake, this was found on an armv7l system.
>
> I have just reproduced the problem with the head lemon.c on an i386
> system. Valgrind reports no stray memory accesses (though quite a few
> memory leaks, but those are probably harmless).
>
> The command to process the parser is:
>
> lemon -c -s parser.y
>
> The generated parser has, among others, the following state:
>
> State 244:
>          constructor_def ::= existental_quant_variables_opt upper_name *
> fix_opt type_clos
>    (200) fix_opt ::= *
>          fix_opt ::= * Fix
>
>                          Open reduce 200
>                     CurlyOpen reduce 200
>                          Hash reduce 200
>                      RectOpen reduce 200
>                             A reduce 200
>                           Dot reduce 200
>                       Exclaim reduce 200
>                          Star reduce 200
>                           Int reduce 200
>                          Real reduce 200
>                          Char reduce 200
>                          Bool reduce 200
>      RectOpenExclaimRectClose reduce 200
>                       Dynamic reduce 200
>                          File reduce 200
>                        String reduce 200
>                         World reduce 200
>                           Fix shift  486
>                   LowerCaseId reduce 200
>                   UpperCaseId reduce 200
>                       FunnyId reduce 200
>                       fix_opt shift  130
>
> As can be seen, fix_opt is nullable - it may derive the empty string.
> Therefore, any symbol that can start what follows it (type_clos),
> or any lookahead symbol (because type_clos is nullable too), must
> be acceptable in this state. However, some tokens (Bar, Semicolon,
> LowerCaseId, UpperCaseId, FunnyId) are not.
>
> The next state lists the acceptable symbols there:
>
> State 130:
>          type_clos ::= * open_type_clos
>          type_clos ::= * closed_type_clos
>          open_type_clos ::= * closed_type_clos type_prefix_seq
>          closed_type_seq ::= * closed_type_clos type_prefix_clos
> type_expression
>          closed_type_clos ::= * closed_type_seq
>    (144) closed_type_clos ::= *
>          constructor_def ::= existental_quant_variables_opt upper_name
> fix_opt * type_clos
>
>                          Open reduce 144
>                     CurlyOpen reduce 144
>                           Bar reduce 144
>                          Hash reduce 144
>                      RectOpen reduce 144
>                             A reduce 144
>                           Dot reduce 144
>                       Exclaim reduce 144
>                          Star reduce 144
>                           Int reduce 144
>                          Real reduce 144
>                          Char reduce 144
>                          Bool reduce 144
>      RectOpenExclaimRectClose reduce 144
>                       Dynamic reduce 144
>                          File reduce 144
>                        String reduce 144
>                         World reduce 144
>                   LowerCaseId reduce 144
>                   UpperCaseId reduce 144
>                       FunnyId reduce 144
>                     Semicolon reduce 144
>                     type_clos shift  563
>               closed_type_seq shift  375
>              closed_type_clos shift  179
>                open_type_clos shift  491
>
> Some of the missing symbols (LowerCaseId, UpperCaseId, FunnyId) are
> starters of type_clos:
>
>  164: type_clos: Open CurlyOpen Hash RectOpen A Dot Exclaim Star Int Real
> Char Bool RectOpenExclaimRectClose Dynamic File String World LowerCaseId
> UpperCaseId FunnyId
>
> The other two (Bar, Semicolon) are not; these are followers of the
> constructor_def symbol.
>
> Looking at it now, <lambda> is not in the starters of type_clos,
> though according to the grammar rules, it is obviously nullable
> (type_clos => closed_type_close => <lambda>). Curious.
>
> The problem disappears when fix_opt is deleted from the RHS of the
> rule. (Of course, this means that a present Fix token is no longer
> accepted.)
>
> Ciao.                                                          Vincent.
> --
> Vincent Zweije <vinc...@zweije.nl>   | "If you're flamed in a group you
> <http://www.xs4all.nl/~zweije/>      | don't read, does anybody get
> burnt?"
> [Xhost should be taken out and shot] |            -- Paul Tomblin on a.s.r.
> _______________________________________________
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>



-- 
D. Richard Hipp
d...@sqlite.org
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to