> the AST API strawman - given the positive discussions on this list, I
> thought the idea was implicitly accepted last year, modulo details,
> so I was surprised not to see a refined strawman promoted.

It hasn't really been championed so far. I was concentrating on other
proposals for ES.next.

> - it does not support generic traversals, so it definitely needs a
> pre-implemented traversal, sorting out each type of Node
> (Array-based ASTs, like the es-lab version, make this slightly
> easier - Array elements are ordered, unlike Object properties);

I designed it to be easily JSON-{de}serializable, so no special
prototype. However, you can use the builder API to construct your own
format:

https://developer.mozilla.org/en/SpiderMonkey/Parser_API#Builder_objects

With a custom builder you can create objects with whatever methods you
want, and builders for various formats can be shared in libraries.

> at that stage, simple applications (such as tag generation)
> may be better off working with hooks into the parser, rather
> than hooks into an AST traversal? also, there is the risk that
> one pre-implemented traversal might not cover all use cases,
> in which case the boilerplate tax would have to be paid again;

I don't understand any of this.

> - it is slightly easier to manipulate than an Array-based AST, but

More than slightly, IMO.

> lack of pattern matching fall-through (alternative patterns for
> destructuring) still hurts, and the selectors are lengthy, which
> hampers visualization and construction; (this assumes that
> fp-style AST processing is preferred over oo-style processing)

If I'd defined a new object type with its own prototype, it still
wouldn't define all operations anyone would ever want. So they'd either
have to monkey-patch it or it would need a visitor. Which you could
write anyway. So I don't see much benefit to pre-defining a node
prototype. But again, see the builder API, where you can create your own
custom node type.
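
As a rough illustration of the "visitor you could write anyway" point
(this is not part of the Parser API), a generic walker over the
plain-object node format might look something like the sketch below. The
names isNode and walk are hypothetical, and the AST is written by hand in
the documented node shape rather than produced by Reflect.parse, so the
sketch does not depend on SpiderMonkey. It assumes that every node is an
object with a string `type` property and that nothing else in the tree is.

```javascript
// Hand-written AST for `x = 1 + 2;`, following the node shapes in the
// Parser API documentation (plain objects tagged with a `type` string).
const ast = {
  type: "Program",
  body: [{
    type: "ExpressionStatement",
    expression: {
      type: "AssignmentExpression",
      operator: "=",
      left: { type: "Identifier", name: "x" },
      right: {
        type: "BinaryExpression",
        operator: "+",
        left: { type: "Literal", value: 1 },
        right: { type: "Literal", value: 2 }
      }
    }
  }]
};

// Hypothetical helper: anything that is an object with a string `type`
// is treated as an AST node (an assumption of this sketch).
function isNode(value) {
  return value !== null && typeof value === "object" &&
         typeof value.type === "string";
}

// Hypothetical generic walker: calls visitor[node.type] when such a
// handler exists, then recurses into every property that holds a node
// or an array of nodes. No per-node-type traversal code is needed.
function walk(node, visitor) {
  const handler = visitor[node.type];
  if (handler) handler(node);
  for (const key of Object.keys(node)) {
    const child = node[key];
    if (Array.isArray(child)) {
      for (const item of child) {
        if (isNode(item)) walk(item, visitor);
      }
    } else if (isNode(child)) {
      walk(child, visitor);
    }
  }
}

// Usage: collect identifier names and literal values.
const identifiers = [];
const literals = [];
walk(ast, {
  Identifier(n) { identifiers.push(n.name); },
  Literal(n) { literals.push(n.value); }
});
```

After the walk, identifiers holds ["x"] and literals holds [1, 2]. Child
order here follows property enumeration order, so a tool that needs
strict source order would probably want an explicit per-type child list
instead.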

> - it is biased towards evaluation, which is a hindrance for other
> uses (such as faithful unparsing, for program transformations);

It's just a reflection of the built-in SpiderMonkey parser, which was
designed for the sole purpose of evaluation. I didn't reimplement a new
parser.

> this can be seen clearly in Literals, which are evaluated (why
> not evaluate Object, Array, Function Literals as well? eval should
> be part of AST processing, not of AST construction), but it also
> shows in other constructs (comments are not stored at all, and
> if commas/semicolons are not stored, how does one know
> where they were located - programmers tend to be picky
> about their personal or project-wide style guides?);

None of this data is available in a SpiderMonkey parse node.

> - there are some minor oddities, from spelling differences to
> the spec (Label(l)ed),

Heh, I shouldn't've capitulated to my (excellent and meticulous!)
reviewer, who was unfamiliar with the spec:

https://bugzilla.mozilla.org/show_bug.cgi?id=533874#c28

I can probably change that.

> to structuring decisions (why separate
> UpdateExpression and LogicalExpression, when everything
> else is in UnaryExpression and BinaryExpression?);

I separated update expressions and logical expressions because they have
different control structure from the other unary and binary operators.

> btw, why alternate/consequent instead of then/else, and

I was avoiding using keywords as property names, and
consequent/alternate are standard terminology. I suppose .then/.else
would be more convenient.

> shouldn't that really be consequent->then and alternate->else
> instead of the other way round (as the optional null for
> consequent suggests)?

Doc bug, thanks. Fixed.

> My main issue is unparsing support for program transformations

https://bugzilla.mozilla.org/show_bug.cgi?id=590755

> (though IDEs will similarly need more info, for comment extraction,
> syntax highlighting, and syntax-based operations).

This is all the stuff that will almost certainly require separate
implementations from the engine's core parser. And maybe that's fine. In
my case, I wanted to implement a reflection of our existing parser,
because it's guaranteed to track the behavior of SpiderMonkey's parser.

> What I did for now was to add a field to each Node, in which I
> store an unprocessed Array of the sub-ASTs, including tokens.
> Essentially, the extended AST Nodes provide both abstract info
> for analysis and evaluation and a structured view of the token
> stream belonging to each Node, for lower-level needs.
>
> Whitespace/comments are stored separately, indexed by the
> start position of the following token (this is going to work better
> for comment-before-token than for comment-after-token, but it
> is a start, for unparsing or comment-extraction tools).

You've lost me again. Are you describing a parser you wrote?

> This allows for a generic traversal of the Array-based unprocessed
> AST fragments, for unparsing, but I still have to rearrange things
> so that I can actually store the information I need (can't add info
> to null as an AST value) and distinguish meta-info ("computed"
> and "prefix" properties) from sub-ASTs.

I'm still lost.

> Overall, the impression is that this AST was designed by someone
> resigned to the fact of having to write Node-type-specific traversal
> code for each purpose, with a limited number of purposes planned
> (such as evaluation). This could be a burden for other uses of such
> ASTs (boilerplate tax).

It was designed to be minimal and serializable. It was a lot of code, so
I figured I would just focus on a) making sure all the data was there
and b) making it possible to provide a custom data format via the
builder API. This is what I came up with, but I can revisit the API
design if it's useful.

> I hope these notes help - I'd really like to see a standard JS
> parser API implemented across engines.
> For language experimentation, we'd still need separate tweakable
> parsers, but access to the efficient engine parsers for current JS
> would give tool development a boost.

I'm still not convinced this is such a big win. Reflect.parse gives you
*some* performance, but it still requires two traversals (one to
generate the internal C++ JSParseNode tree and then a second to convert
that to a JS object tree). But part of the benefit is knowing you have
exactly the SpiderMonkey parser. Once implementors have to write a
separate parser, the possibility of divergence increases, and the
maintenance cost of building a second parser in a low-level language is
high. At that point, they might just want to write it in JS. But anybody
could do that.

>> But there are also tough questions about what the parser
>> should do with engine-specific language extensions.
>
> Actually, that starts before the AST: I'd like to see feature-based
> language versioning, instead of the current monolithic version
> numbering - take generators as an example feature:
>
> Perhaps JS1.7 ("javascript;version=1.7") happens to be the first
> JS version to support "yield", and is backwards compatible with
> JS1.5, which might happen to match ES3; and JS1.8.5, which
> happens to match ES5, might be backwards compatible with
> JS1.7. But it is unlikely that the JSx which happens to match ES6
> will be backwards compatible with JS1.7 (while ES5-breaking
> changes will be limited, replacing experimental JS1.x features
> with standardized variants is another matter).
>
> Whereas, if I was able to specify "use yield", and be similarly
> selective about other language features, then either of JS1.7,
> JS1.8.5 and ES6 engines might be able to do the job, depending
> on what other language features my code depends on. Also,
> other engines might want to implement some features -like
> "yield"- selectively, without aiming to support all of JS1.7, and
> long before being able to support all of ES6.

That's asking for quite a modularized/configurable parser.

>> I agree about the issue of multiple parsers. The reason I
>> was able to do the SpiderMonkey library fairly easily was
>> that I simply reflect exactly the parser that exists. But to
>> have a standards-compliant parser, we'd probably have
>> to write a separate parser. That's definitely a tall order.
>
> It should not be, provided one distinguishes between
> standards-compliant and production use. If the ES grammar
> is LR(1), it should really be specified in a parser tool format,

Mainstream production JS engines have moved away from parser generators.

> both for verification and to generate standards-compliant
> tools to compare against. Depending on how efficient the
> JS Bison implementation is, this might even lead to useable
> parser performance.

Again, this could be implemented by anyone as a pure JS library.

> There may be problems in finding a tool that generates all
> the information needed for a useful AST (source locations,
> comments, scope info, ..), but we do not need to solve every
> issue immediately to make progress, right? And if the ES
> committee were to ask ES parser generator implementors
> whether their tools could be extended to serve an AST spec,
> response might be favourable.
>
> It would be nice if the spec parser was generated in Javascript,
> but any tool-usable standard grammar would be useful - once
> the grammar can be processed by a freely available tool, it can
> be translated to similar formats, some of which have Javascript
> implementations (eg Jison, ANTLR).
>
> Having played a little with the ANTLRWorks environment, it
> looks promising, is easy to install (just a .jar), has user-contributed
> ES grammars, and can spot some ambiguities easily (though
> I don't think its check is complete, and the ES grammar is too
> complex to make naïve parse-tree visualization helpful). If other
> tools have better ES grammar development support, I'd like to
> hear about them.
>
> Without a standard spec-conformant tool-readable grammar,
> such tools remain of limited use. With a tool-readable grammar,
> adding AST generation might turn out to be an afternoon's work
> (followed by years of testing/debugging;-).

A standard, machine-processable grammar would be a nice-to-have. Agreed.

I hate to complain, but can you try to trim your messages? It takes an
enormous amount of time to read and respond to these huge messages.

https://twitter.com/#!/statpumpkin/status/66187260407709696

Dave

_______________________________________________
es-discuss mailing list
es-discuss@mozilla.org
https://mail.mozilla.org/listinfo/es-discuss