On Fri, Jun 12, 2015 at 5:05 AM, Keean Schupke <[email protected]> wrote:
> On 12 Jun 2015 09:52, "Matt Oliveri" <[email protected]> wrote:
>> On Fri, Jun 12, 2015 at 4:18 AM, Keean Schupke <[email protected]> wrote:
>> > On 12 Jun 2015 08:57, "Matt Oliveri" <[email protected]> wrote:
>> >> Keean, the goal here is not to improve performance over the
>> >> one-big-tagged-union option. It is indeed clear that you can't do
>> >> better, if you want AST nodes to be useful. The goal is to know,
>> >> statically, that the parsed AST will always conform to the
>> >> constrained structure allowed by the original CFG.
>> >
>> > Can we not rely on being well-typed by construction? So if the
>> > parser only outputs well-typed fragments, that's enough? Somewhere
>> > we need a specification of the grammar; if you are using parser
>> > combinators, we could say the combinators are the grammar
>> > specification, so what they output is always well typed.
>>
>> I don't think that's paranoid enough. How do we know the parser only
>> outputs well-typed fragments? Even if we can agree on a set of
>> combinators to use for defining the programming language in the
>> first place, we still want some kind of evidence that the
>> implementations of those combinators always produce the parser we
>> think they do.
>>
>> And really, it doesn't seem like a good idea to use parser
>> combinators instead of the usual typing rules. This seems to _force_
>> us to combine parsing and type checking. Generally, for
>> specification purposes, it's a good idea to split a complex thing
>> into as many steps as possible, so that each step's meaning is
>> maximally obvious. Implementations can be as yucky and clever and
>> efficient as you please, if you can verify them against the
>> simple-minded spec.
>
> But a grammar, excluding the attributes, is the same as the parser
> specification.
Right. And the typing rules, which are stated in terms of ASTs, not
surface syntax, are the type checker spec. And they are separate.

In both cases, the spec is _generative_ whereas the implementation is
_analytic_. That is, a CFG tells you how to generate all the parseable
inputs, whereas a parser takes an input and tests whether it's
parseable (and also does something with it, hopefully). Typing rules
tell you how to generate all the well-typed program fragments (as
ASTs), whereas a type checker takes an AST and tests whether it's
well-typed.

PEGs are actually analytic, and I'd argue they're really no better
than an abstract combinator API. So I thought you were saying that
because a parser generates an AST, we could think of a parser, with
integrated type checking, made with combinators, as a stand-in for a
grammar and typing rules, since it's like a PEG that generates all
well-typed program fragments. And I was trying to explain why this
approach does not provide enough evidence that the programmer knows
what they're doing. Aside from what I was saying about not keeping
parsing and typing separate, and about buggy combinator
implementations, you have Shap's old complaint that with a PEG, you
can't tell whether the language is ambiguous.

> For example, we take an EBNF grammar specification, and write it
> using parser combinators, or compile it into code. What else is
> there to check it against?

Since (I think) EBNF is generative, it's actually not a given that the
combinators parse the same language that the grammar generates. You'd
have to prove it. So that would be something to check, even if you
consider a type-checking, combinator-based parser to be an acceptable
definition of a type system, which I don't.

You can always take an implementation to be a spec, in which case it's
correct by definition. But most implementations have bugs, so taking
an implementation for a spec usually gets you a pretty stupid
definition of correctness.
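To make the generative/analytic distinction concrete, here's a toy
sketch in Haskell. This is a made-up expression language, nothing to
do with BitC, and all the names are invented for illustration. The
same typing rules appear twice: once generatively, as an enumerator of
all well-typed fragments of a given type up to a size bound, and once
analytically, as a checker over arbitrary ASTs.

    data Ty = TNat | TBool
      deriving (Eq, Show)

    data Expr = Lit Int | Tru | Fls | Add Expr Expr | If Expr Expr Expr
      deriving (Eq, Show)

    -- Generative reading (the spec): the typing rules, used to
    -- enumerate every well-typed fragment of a given type, up to
    -- a size bound n (assumed >= 0).
    gen :: Int -> Ty -> [Expr]
    gen 0 TNat  = [Lit 0]
    gen 0 TBool = [Tru, Fls]
    gen n t = gen 0 t ++ adds ++ ifs
      where
        adds = case t of
          TNat  -> [Add a b | a <- gen (n-1) TNat, b <- gen (n-1) TNat]
          TBool -> []
        ifs = [If c x y | c <- gen (n-1) TBool,
                          x <- gen (n-1) t, y <- gen (n-1) t]

    -- Analytic reading (the implementation): take an arbitrary AST
    -- and test whether it is well-typed.
    check :: Expr -> Maybe Ty
    check (Lit _) = Just TNat
    check Tru     = Just TBool
    check Fls     = Just TBool
    check (Add a b)
      | check a == Just TNat, check b == Just TNat = Just TNat
      | otherwise                                  = Nothing
    check (If c x y)
      | check c == Just TBool, Just t1 <- check x,
        check y == Just t1 = Just t1
      | otherwise          = Nothing

The theorem relating the two readings, which is exactly the sort of
thing you'd have to prove about an EBNF grammar and the combinators
that claim to implement it, is that they agree:

    -- for every n >= 0 and every t:
    --   all (\e -> check e == Just t) (gen n t)

If you instead declare the analytic side to be the spec, that theorem
holds trivially, which is the "pretty stupid definition of
correctness" I mean.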
