On Sat, 2007-03-24 at 21:19 -0700, Erick Tryzelaar wrote:
> Next, I think that the nonterminal syntax should be changed to be more
> regular. Currently it's this:
>
> nonterm eexpr : expr_t =
> | xx:eexpr TOK_PLUS y:TOK_INT =>
>     match xx with
>     | Integr ?i => Integr (i+y)
>     endmatch
>
> | y:TOK_INT => Integr y
> ;
>
> But shouldn't it be:
>
> nonterm eexpr : expr_t =
> | ?xx:eexpr TOK_PLUS ?y:TOK_INT =>
>     match xx with
>     | Integr ?i => Integr (i+y)
>     endmatch
>
> | ?y:TOK_INT => Integr y
> ;
>
> To match the standard pattern matching syntax?
Possibly. The thing is that the same argument applies to functions too:

  fun f(?x:int) => x + x;

makes sense; in fact we'd love to allow arbitrary patterns as arguments,
as OCaml indeed does. Unfortunately Felix also has to cope with lvalues
and the ref/var/val distinctions, which mess this idea up a bit.
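For concreteness, here is a minimal sketch of the contrast. The
pattern-in-argument form is hypothetical, and fst is an illustrative
name; the match form is what you'd write today (assuming tuple patterns
are accepted in match the same way variant patterns are):

  // hypothetical, if arbitrary argument patterns were allowed:
  //   fun fst((?a, _) : int * int) => a;

  // what works now: bind the whole argument, then match on it
  fun fst(p : int * int) : int =>
    match p with
    | (?a, _) => a
    endmatch
  ;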
> Third, can you explain this usage of repeating tokens? What is the
> second component of the "case 1 (?j,_)"?
>
> // a grammar for expressions
> nonterm eexpr : expr_t =
> | xx:eexpr TOK_PLUS y:TOK_INT+ =>
>     match xx with
>     | Integr ?i => let case 1 (?j,_) = y in Integr (i+j)
>     endmatch
>
> | y:TOK_INT => Integr y
> ;
Sure .. it's a real list, not one of the nominally typed ones
from the library module List. The type of a list is

  (1 + t * list) as list

so the second component (case 1) has an argument of type

  t * list

In this case t is int; the _ here is the tail of the list.
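To make that concrete, here's a minimal sketch of pulling apart the
value bound by y:TOK_INT+ under that encoding; the name rest is
illustrative, and presumably case 0 (the empty list) can't occur here
since + matches at least one token, which is why the bare let is safe:

  // y : (1 + int * list) as list, with t = int
  // case 1 carries (head, tail)
  let case 1 (?j, ?rest) = y in
    Integr j   // use just the first TOK_INT; rest holds the others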
> The other parsers I've used return a list/array in these cases, but that
> doesn't seem to be the case here.
Yep, it is a list.
>
> Fourth, I wrote up a basic wrapper around the lexing functions, called
> tokenize:
>
> gen tokenize[T] (lex:iterator->iterator->iterator*T) (eof:T)
>   (var s:string) (): T = {
>   val first = Lexer::start_iterator s;
>   val finish = Lexer::end_iterator s;
>   var current = first;
> start:>
>   if current == finish do
>     goto stop;
>   done;
>   val next, tok = lex current finish;
>   current = next;
>   yield tok;
>   goto start;
> stop:>
>   return eof;
> }
>
>
> Would this be useful to add to the stdlib? Here's an example:
Ah, interesting .. you used a yield, which didn't exist originally.
Probably would be useful, yes.
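For what it's worth, here's a hypothetical way to drive it, in the same
goto style as the wrapper itself. Everything here is an illustrative
stand-in: tok_t is the token type (with equality defined), mylex the
lexer function, EOF the distinguished end token, and process whatever
you do with each token:

  proc demo (src:string) {
    val next_token = tokenize[tok_t] mylex EOF src;
    var t = next_token();
  loop:>
    if t == EOF do
      goto fini;
    done;
    process t;          // handle one token
    t = next_token();
    goto loop;
  fini:>
  }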
The only nasty is the 'eof' value .. an alternative form would return
an option type. This isn't the same as setting T to the option type,
because the actual lexer returns a non-option, so in the alternative
you'd be saying

  yield (Some tok)
  ..
  return None

You could do that too; call it

  gen maybe_tokenize // dang yank spellingz ..
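Spelled out, a minimal sketch of that alternative, assuming the option
type is opt[T] with constructors Some and None (the explicit None[T]
type argument is my assumption; it may well be inferrable):

  gen maybe_tokenize[T] (lex:iterator->iterator->iterator*T)
    (var s:string) (): opt[T] = {
    val first = Lexer::start_iterator s;
    val finish = Lexer::end_iterator s;
    var current = first;
  start:>
    if current == finish do
      goto stop;         // no more input: fall through to None
    done;
    val next, tok = lex current finish;
    current = next;
    yield (Some tok);    // wrap each real token
    goto start;
  stop:>
    return None[T];      // end of stream, no 'eof' parameter needed
  }

Note the 'eof' parameter is gone entirely: the end of the stream is
encoded in the return type rather than in a distinguished token value.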
The 'eof' version has the appeal that many token variant types already
contain an 'end' token, and many streams have that end token already
in them, so you'd be testing for it anyway, and the 'return eof' would
never actually be executed.
--
John Skaller <skaller at users dot sf dot net>
Felix, successor to C++: http://felix.sf.net