A tiny grammar was implemented in both Antlr and Bison+Flex (shown below).
Test files repeating two lines (shown below) were made in 6 different sizes.
A single executable was compiled, with a command-line switch choosing either
Antlr or Bison+Flex. Each parser was run once with empty actions and once with
actions populated, to compare pure parsing with some actual work.

Results:

                                               CPU time        Peak Memory
File Name      File Size  # modules  # tokens  Bison  Antlr    Bison  Antlr

Action bodies empty:
source.v10m    460mb      10m        150m      28s             572k   *
source.v5m     230mb      5m         75m       15s             572k   *
source.v2.5m   115mb      2.5m       37m       7s              572k   *
source.v1m     46mb       1m         15m       2s              572k   *
source.v500k   23mb       500k       7.5m      1s              572k   *
source.v250k   11mb       250k       3.7m      <1s    4s       572k   1.7g  <-----------

Action bodies populated:
source.v250k   11mb       250k       3.7m      9s     13s      477m   1.7g  <-----------

* Antlr ran out of memory at 2gb

Comments:

1. I expected the requirement that the entire file be resident in memory to be
   the memory glut. Surprise! Quick inspection suggests that an initial
   tokenizing of the entire in-memory file consumes gobs of memory, pushing a
   small footprint up to 1.7gb before releasing it. Only the smallest test file
   stayed under the 2gb limit of a 32-bit process. Please fix!!

2. Antlr is clearly slower than Bison+Flex; however, empty actions don't make
   interesting programs. The test with actions enabled shows a 9s vs. 13s
   difference, considerably less than the empty-action case (<1s vs. 4s).

3. If you've never set up Bison+Flex I have only one comment: !...@%$#.
   Two thumbs up for Antlr.

Details:
Vista 64, AMD Opteron 2.4GHz, 16gb RAM
Visual Studio 2008 SP1
One exe file with both Antlr and Bison+Flex, targeting 32 bit
Full Optimization (/Ox), Inline Any Suitable (/Ob2), Favor Small Code (/Os)

Versions:
Antlr 3.2
Bison 2.4.2 LR(1)
Flex 2.5.35

------------------- Input file -----------------------------
module tiptop #(int p1=3, p2=4 );
endmodule
... repeat to the indicated number of modules ...

------------------- Antlr Grammar --------------------------
source_text
    : description ( description )*
    ;

description
    : module_declaration
    ;

module_declaration
    : module_ansi_header ENDMODULE ( ':' module_identifier )?
      { act_module(); }
    ;

module_ansi_header
    : MODULE_KEYWORD module_identifier ( parameter_port_list )? ';'
    ;

module_identifier
    : identifier
    ;

parameter_port_list
    : '#' '(' parameter_port_declaration ( ',' parameter_port_declaration )* ')'
    | '#' '(' ')'
    ;

parameter_port_declaration returns [void* node]
scope { void* type; void* head; void* tail; }
    : data_type
      { $parameter_port_declaration::type = $data_type.node;
        $parameter_port_declaration::head=NULL; }
      list_of_param_assignments
      { $node = $parameter_port_declaration::head; }
    ;

list_of_param_assignments
    : param_assignment ( ',' param_assignment )*
    ;

param_assignment
    : parameter_identifier '=' constant_param_expression
      { act_param_assignment ( & $parameter_port_declaration::head,
                               & $parameter_port_declaration::tail,
                               $parameter_identifier.node,
                               $parameter_port_declaration::type,
                               $constant_param_expression.node ); }
    ;

constant_param_expression returns [void* node]
    : constant_mintypmax_expression
      { $node = $constant_mintypmax_expression.node; }
    // | '$'
    ;

constant_mintypmax_expression returns [void* node]
    : constant_expression
      { $node = $constant_expression.node; }
    ;

// Deviate from LRM
constant_expression returns [void* node]
    : expr { $node = $expr.node; }
    ;

parameter_identifier returns [void* node]
    : identifier { $node = $identifier.node; }
    ;

data_type returns [void* node]
    : integer_atom_type signing {$node=act_type($integer_atom_type.value,$signing.value);}
    | integer_atom_type {$node=act_type($integer_atom_type.value,-1);}
    ;

expr returns [void* node]
    : NUMBER { $node = act_number( $NUMBER.text->chars ); }
    ;

identifier returns [void* node]
    : SIMPLE_IDENTIFIER { $node = act_identifier( $SIMPLE_IDENTIFIER.text->chars ); }
    ;

/*------------------------------------------------------------------
 * LEXER RULES
 *------------------------------------------------------------------*/
integer_atom_type returns [int value]
    : TokByte {$value = TokByte;}
    | TokShortint {$value = TokShortint;}
    | TokInt {$value = TokInt;}
    | TokLongint
      {$value = TokLongint;}
    | TokInteger {$value = TokInteger;}
    | TokTime {$value = TokTime;}
    ;

signing returns [int value]
    : TokSigned {$value= TokSigned;}
    | TokUnsigned {$value= TokUnsigned;}
    ;

MODULE_KEYWORD
    : (( 'module' )|('macromodule') )
    ;

ENDMODULE
    : 'endmodule'
    ;

SIMPLE_IDENTIFIER
    : ( 'a'..'z'|'A'..'Z'|'_' ) ( 'a'..'z'|'A'..'Z'|'_'|'0'..'9'|'$')*
    ;

NUMBER
    : (DIGIT)+
    ;

WHITESPACE
    : ( '\t' | ' ' | '\r' | '\n'| '\u000C' )+ { $channel = HIDDEN; }
    ;

fragment DIGIT
    : '0'..'9'
    ;

------------------- Bison Grammar --------------------------
%%

source_text
    : description
    ;

description
    : module_declaration
    | description module_declaration
    ;

module_declaration
    : module_ansi_header TokEndmodule
      { act_module(); }
    | module_ansi_header TokEndmodule ':' module_identifier
      { act_module(); }
    ;

module_ansi_header
    : TokModule module_identifier ';'
    | TokModule module_identifier parameter_port_list ';'
    ;

module_identifier
    : identifier { $$ = $1; }
    ;

parameter_port_list
    : '#' '(' parameter_port_list_recur ')'
    | '#' '(' ')'
    ;

parameter_port_list_recur
    : parameter_port_declaration
    | parameter_port_list_recur ',' parameter_port_declaration
    ;

parameter_port_declaration
    : parameter_port_declaration_scope data_type
      { $1.type = $2; $1.head = NULL; }
      list_of_param_assignments
      { $$ = $1.head; }
    ;

parameter_port_declaration_scope
    :
    ;

list_of_param_assignments
    : nil nil param_assignment    /* FIX:: need LR(2) here */
    | list_of_param_assignments ',' param_assignment
    ;

param_assignment
    : parameter_identifier '=' constant_param_expression
      { act_param_assignment ( & $<scope1>-3.head,
                               & $<scope1>-3.tail,
                               $1,
                               $<scope1>-3.type,
                               $3 ); }
    ;

constant_param_expression
    : constant_mintypmax_expression { $$ = $1; }
    // | '$'
    ;

constant_mintypmax_expression
    : constant_expression { $$ = $1; }
    ;

// Deviate from LRM
constant_expression
    : expr { $$ = $1; }
    ;

parameter_identifier
    : identifier { $$ = $1; }
    ;

data_type
    : integer_atom_type signing { $$ = act_typeB($1,$2); }
    | integer_atom_type { $$ = act_typeB($1,-1); }
    ;
expr
    : TokNumber { $$ = act_number( $1 ); }
    ;

nil
    : /* empty */
    ;

identifier
    : TokIdentifier { $$ = act_identifier( $1 ); }
    ;

integer_atom_type
    : TokByte { $$ = $1; }
    | TokShortint { $$ = $1; }
    | TokInt { $$ = $1; }
    | TokLongint { $$ = $1; }
    | TokInteger { $$ = $1; }
    | TokTime { $$ = $1; }
    ;

signing
    : TokSigned { $$ = $1; }
    | TokUnsigned { $$ = $1; }
    ;

%%
---------------------------------------------------------------
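
For anyone wanting to reproduce the input files, here is a minimal sketch of a generator in C. It is not the tool used for the runs above. The names MODULE_TEXT, TOKENS_PER_MODULE, expected_bytes, expected_tokens and write_test_file are hypothetical, and the CRLF line endings are an assumption, picked because they give 46 bytes per module, which agrees with the file sizes in the results table (10m modules -> 460mb). Counting tokens by hand against the grammar gives 15 per module, which agrees with the # tokens column (10m modules -> 150m).

```c
#include <stdio.h>
#include <string.h>

/* The two lines repeated in every test file (from the "Input file" listing).
   ASSUMPTION: CRLF line endings, which make one module exactly 46 bytes. */
static const char MODULE_TEXT[] =
    "module tiptop #(int p1=3, p2=4 );\r\n"
    "endmodule\r\n";

/* Tokens in one module, counted by hand against the grammar:
   module tiptop # ( int p1 = 3 , p2 = 4 ) ; endmodule */
enum { TOKENS_PER_MODULE = 15 };

/* File size in bytes for the given number of modules. */
static unsigned long long expected_bytes(unsigned long long modules)
{
    return modules * (unsigned long long)strlen(MODULE_TEXT);
}

/* Token count for the given number of modules. */
static unsigned long long expected_tokens(unsigned long long modules)
{
    return modules * TOKENS_PER_MODULE;
}

/* Write `modules` copies of MODULE_TEXT to `path`.
   Returns 0 on success, -1 on any I/O error. */
static int write_test_file(const char *path, unsigned long long modules)
{
    /* Binary mode so the "\r\n" pairs are written exactly as given. */
    FILE *out = fopen(path, "wb");
    unsigned long long i;

    if (out == NULL)
        return -1;
    for (i = 0; i < modules; ++i) {
        if (fputs(MODULE_TEXT, out) == EOF) {
            fclose(out);
            return -1;
        }
    }
    return fclose(out) == 0 ? 0 : -1;
}
```

Calling write_test_file("source.v250k", 250000ULL) from a main() would produce a file the size of the smallest test case; the two expected_* helpers are only there to sanity-check a generated file against the table.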