On 6 September 2014 03:35, Jonathan S. Shapiro <[email protected]> wrote:
> Is the tokenizer/parser phase distinction actually useful in modern front
> ends?
Unsure.
I often think it would be easier to write a parser in one go, perhaps in a
pattern-matching language.
Just making one up, because ML in this font is hard to read*:

  read_expr : LazyString -> AstSource ()
  read_expr('"':xs) = string('"', xs, string_builder())
  read_expr("'":xs) = string("'", xs, string_builder())
  ...

  string(end_token, n:rest, body) = case n of
      end_token => emit_term(StringToken(body.build()))
      '\\'      => string_escaped(end_token, rest, body)
      _         => do body.append(n)
                      string(end_token, rest, body)

Using functions to carry parametrised state like this is fun.
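
For anyone who wants to run the idea, here is a rough Python sketch of the same shape. It is an illustration under assumptions, not a translation of any real front end: every name is hypothetical, the input is a plain str rather than a lazy string, the result is a tuple instead of an emitted AST term, and the escape rule (take the character after the backslash verbatim) is a guess, since string_escaped is not spelled out above. Python has no tail-call elimination, so the recursion becomes a loop, but the closing quote and the buffer are still just parameters.

```python
from typing import List, Tuple

# All names here are hypothetical; they mirror the made-up notation only.
Token = Tuple[str, str, int]  # (kind, text, offset just past the token)


def read_expr(src: str, pos: int = 0) -> Token:
    """Dispatch on the first character; only string literals are handled."""
    if pos < len(src) and src[pos] in ('"', "'"):
        return read_string(src[pos], src, pos + 1, [])
    raise SyntaxError(f"unexpected input at offset {pos}")


def read_string(end_token: str, src: str, pos: int, body: List[str]) -> Token:
    """Consume characters until end_token is seen.

    The quote character and the buffer arrive as arguments, so the same
    function serves both quoting styles without a separate lexer mode.
    """
    while pos < len(src):
        ch = src[pos]
        if ch == end_token:
            return ("StringToken", "".join(body), pos + 1)
        if ch == "\\":
            pos = read_escaped(src, pos + 1, body)
        else:
            body.append(ch)
            pos += 1
    raise SyntaxError("unterminated string literal")


def read_escaped(src: str, pos: int, body: List[str]) -> int:
    """Assumed escape rule: append the escaped character as-is."""
    if pos >= len(src):
        raise SyntaxError("unterminated escape")
    body.append(src[pos])
    return pos + 1
```

Calling read_expr on a source string that starts with either quote character returns the token kind, the decoded body, and the offset where reading should resume.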
> What value is it providing, and should it be preserved?
It's probably faster, at least where tokenising is a single DFA. Not sure
parser performance matters much, though.
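
To make "a single DFA" concrete, here is a minimal table-driven tokeniser sketch in Python. The token kinds, state names and character classes are all invented for the example, and it says nothing about actual speed; it only shows the shape: one table lookup per input character and no parser state involved. Every non-start state in this toy table is accepting, so longest-match needs no backtracking, which would not hold for a realistic token set.

```python
from typing import List, Optional, Tuple


def char_class(ch: str) -> str:
    """Collapse characters into a few classes to keep the table small."""
    if ch.isalpha() or ch == "_":
        return "alpha"
    if ch.isdigit():
        return "digit"
    if ch.isspace():
        return "space"
    return "other"


# (state, character class) -> next state; a missing entry means "stop".
TRANSITIONS = {
    ("start", "alpha"): "ident",
    ("start", "digit"): "int",
    ("start", "space"): "ws",
    ("ident", "alpha"): "ident",
    ("ident", "digit"): "ident",
    ("int", "digit"): "int",
    ("ws", "space"): "ws",
}

# Accepting states and the token kind each yields (None means "skip it").
ACCEPTING = {"ident": "IDENT", "int": "INT", "ws": None}


def tokenize(src: str) -> List[Tuple[str, str]]:
    tokens: List[Tuple[str, str]] = []
    pos = 0
    while pos < len(src):
        state = "start"
        end = pos
        # Run the DFA as far as it will go from the current offset.
        while end < len(src):
            nxt: Optional[str] = TRANSITIONS.get((state, char_class(src[end])))
            if nxt is None:
                break
            state = nxt
            end += 1
        if state not in ACCEPTING:
            raise SyntaxError(f"unexpected character {src[pos]!r} at offset {pos}")
        kind = ACCEPTING[state]
        if kind is not None:
            tokens.append((kind, src[pos:end]))
        pos = end
    return tokens
```

For example, tokenize("foo 42 bar7") yields an IDENT, an INT and another IDENT, with the whitespace dropped.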
> Why use two algorithms (regular expression engine, parse algorithm) when
> one will suffice?
Because a parser generator can easily generate both, I guess.
* Of course, I later realised I can change the font. Courier New is my
default font, but since I reply inline, Gmail ignores that. I would much
rather Gmail simply showed text emails in a fixed-width font. Gah, Gmail
sucks, and I should use something better.
--
William Leslie
Notice:
Likely much of this email is, by the nature of copyright, covered under
copyright law. You absolutely MAY reproduce any part of it in accordance
with the copyright law of the nation you are reading this in. Any attempt
to DENY YOU THOSE RIGHTS would be illegal without prior contractual
agreement.
_______________________________________________
bitc-dev mailing list
[email protected]
http://www.coyotos.org/mailman/listinfo/bitc-dev