[Felix-language] lexer and parser comments

Erick Tryzelaar Sat, 24 Mar 2007 20:19:13 -0800

I've been playing around with the parser for the first time, and I have some 
comments. First, I think parse should return an opt instead of an anonymous 
union. See here, on line 52:


http://felix.sourceforge.net/doc/tutorial/introduction/en_flx_tutorial_0070.html

    51:   match z with
    52:   | case 0 => { print "Error"; }
    53:   | case 1 (?i) => { print i; }
    54:   endmatch;

I think it would be a little more consistent if it used the option type.

Next, I think that the nonterminal syntax should be changed to be more 
regular. Currently it's this:

    nonterm eexpr : expr_t =
    | xx:eexpr TOK_PLUS y:TOK_INT =>
      match xx with
      | Integr ?i => Integr (i+y)
      endmatch

    | y:TOK_INT => Integr y
    ;

But shouldn't it be:

    nonterm eexpr : expr_t =
    | ?xx:eexpr TOK_PLUS ?y:TOK_INT =>
      match xx with
      | Integr ?i => Integr (i+y)
      endmatch

    | ?y:TOK_INT => Integr y
    ;

To match the standard pattern matching syntax?

Third, can you explain this usage of repeating tokens? What is the 
second component of the "case 1 (?j,_)"?

    // a grammar for expressions
    nonterm eexpr : expr_t =
    | xx:eexpr TOK_PLUS y:TOK_INT+ =>
      match xx with
      | Integr ?i => let case 1 (?j,_) = y in Integr (i+j)
      endmatch

    | y:TOK_INT => Integr y
    ;

The other parsers I've used return a list/array in these cases, but that 
doesn't seem to be the case here.
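
To make the comparison concrete, here is a rough Python sketch of the list-returning behaviour I mean. This is not Felix and not any particular parser's API: the helper name one_or_more and the (kind, value) token tuples are made up for illustration.

```python
def one_or_more(tokens, pos, kind):
    """Collect the values of consecutive tokens of the given kind.

    Hypothetical helper: tokens is a list of (kind, value) tuples and
    pos is the index to start from. Returns (values, next_pos).
    """
    values = []
    while pos < len(tokens) and tokens[pos][0] == kind:
        values.append(tokens[pos][1])
        pos += 1
    if not values:
        # "+" means at least one occurrence is required.
        raise SyntaxError(f"expected at least one {kind} at position {pos}")
    return values, pos


tokens = [("TOK_INT", 1), ("TOK_INT", 2), ("TOK_PLUS", None)]
values, next_pos = one_or_more(tokens, 0, "TOK_INT")
```

With this shape, a rule like y:TOK_INT+ would bind y to a plain list of the matched values.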


Fourth, I wrote up a basic wrapper around the lexing functions, called 
tokenize:

gen tokenize[T] (lex:iterator->iterator->iterator*T) (eof:T)
  (var s:string) (): T = {
  val first = Lexer::start_iterator s;
  val finish = Lexer::end_iterator s;
  var current = first;
start:>
  // stop once the whole string has been consumed
  if current == finish do
    goto stop;
  done;
  // lex one token and advance past it
  val next, tok = lex current finish;
  current = next;
  yield tok;
  goto start;
stop:>
  // input exhausted: hand back the caller's eof token
  return eof;
}
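
In case the control flow is easier to follow outside Felix, here is a rough Python sketch of the same wrapper. It is only an illustration under a few assumptions: the lexer takes (text, current, end) and returns (next, token), tokens are (kind, value) tuples, and the names lexit and next_token are invented for the sketch. It also assumes that further calls keep returning eof after the input is exhausted.

```python
def tokenize(lex, eof, s):
    """Wrap a one-token lexer into a zero-argument 'give me the next token' function."""
    current = 0          # stands in for Lexer::start_iterator s
    finish = len(s)      # stands in for Lexer::end_iterator s

    def next_token():
        nonlocal current
        if current == finish:
            return eof                      # input exhausted: keep returning eof
        current, tok = lex(s, current, finish)
        return tok

    return next_token


def lexit(s, start, finish):
    """Hypothetical lexer for '+', '.' and runs of ASCII digits."""
    digits = "0123456789"
    ch = s[start]
    if ch == "+":
        return start + 1, ("TOK_PLUS", None)
    if ch == ".":
        return start + 1, ("TOK_DOT", None)
    if ch in digits:
        end = start
        while end < finish and s[end] in digits:
            end += 1
        return end, ("TOK_DIGIT", int(s[start:end]))
    raise ValueError(f"unexpected character {ch!r} at position {start}")


get_token = tokenize(lexit, ("TOK_EOF", None), "11.11+2")
first_eight = [get_token() for _ in range(8)]
```

The point of the wrapper is that the parser only ever sees a plain token source and never touches the iterators directly.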


Would this be useful to add to the stdlib? Here's an example:

########################################################
union token_t =
  | TOK_EOF
  | TOK_DOT
  | TOK_PLUS
  | TOK_DIGIT of int
;

fun lexit(start:iterator) (finish:iterator):iterator*token_t =>
  reglex start to finish with
  | "+" => TOK_PLUS
  | "." => TOK_DOT
  | ["0"-"9"]+ => TOK_DIGIT $ int $ string_between(lexeme_start, lexeme_end)
  endmatch
;

union expr_t =
  | Flt of float
;

nonterm eexpr : expr_t =
  | xx:eexpr TOK_PLUS yy:floatexpr =>
    match xx, yy with
    | Flt ?i, Flt ?j => Flt (i+j)
    endmatch

  | y:floatexpr => y
;

nonterm floatexpr : expr_t =
  | x:TOK_DIGIT TOK_DOT y:TOK_DIGIT => Flt $ float(str(x) + '.' + str(y))
;

proc try_parse() {
  var z : 1 + float =
    //parse get_token "11.11+22.22+33.33" with
    parse tokenize (the lexit) TOK_EOF "11.11+22.22+33.33" with
    | e: eexpr => match e with | Flt ?i => i endmatch
    endmatch
  ;

  match z with
  | case 0 => { print "Error"; }
  | case 1 (?i) => { print i; }
  endmatch;
  endl;
}

try_parse();
########################################################
