I'm no parser expert, but a parser that was easier to understand and modify, and was as fast as the current one, sounds good to me.
It's a tricky area though; e.g. the layout rule. Worth talking to Simon Marlow. Simon | -----Original Message----- | From: ghc-devs <ghc-devs-boun...@haskell.org> On Behalf Of Vladislav | Zavialov | Sent: 08 October 2018 21:44 | To: ghc-devs <ghc-devs@haskell.org> | Subject: Parser.y rewrite with parser combinators | | Hello devs, | | Recently I've been working on a couple of parsing-related issues in | GHC. I implemented support for the -XStarIsType extension, fixed | parsing of the (!) type operator (Trac #15457), allowed using type | operators in existential contexts (Trac #15675). | | Doing these tasks required way more engineering effort than I expected | from my prior experience working with parsers due to complexities of | GHC's grammar. | | In the last couple of days, I've been working on Trac #1087 - a | 12-year old parsing bug. After trying out a couple of approaches, to | my dismay I realised that fixing it properly (including support for | bang patterns inside infix constructors, etc) would require a complete | rewrite of expression and pattern parsing logic. | | Worse yet, most of the work would be done outside Parser.y in Haskell | code instead, in RdrHsSyn helpers. When I try to keep the logic inside | Parser.y, in every design direction I face reduce/reduce conflicts. | | The reduce/reduce conflicts are the worst. | | Perhaps it is finally time to admit that Haskell syntax with all of | the GHC cannot fit into a LALR grammar? | | The extent of hacks that we have right now just to make parsing | possible is astonishing. For instance, we have dedicated constructors | in HsExpr to make parsing patterns possible (EWildPat, EAsPat, | EViewPat, ELazyPat). That is, one of the fundamental types (that the | type checker operates on) has four additional constructors that exist | due to a reduce/reduce conflict between patterns and expressions. | | I propose a complete rewrite of GHC's parser to use recursive descent | parsing with monadic parser combinators. | | 1. We could significantly simplify parsing logic by doing things in a | more direct manner. For instance, instead of parsing patterns as | expressions and then post-processing them, we could have separate | parsing logic for patterns and expressions. | | 2. We could fix long-standing parsing bugs like Trac #1087 because | recursive descent offers more expressive power than LALR (at the cost | of support for left recursion, which is not much of a loss in | practice). | | 3. New extensions to the grammar would require less engineering effort. | | Of course, this rewrite is a huge chunk of work, so before I start, I | would like to know that this work would be accepted if done well. | Here's what I want to achieve: | | * Comparable performance. The new parser could turn out to be faster | because it would do less post-processing, but it could be slower | because 'happy' does all the sorts of low-level optimisations. I will | consider this project a success only if comparable performance is | achieved. | | * Correctness. The new parser should handle 100% of the syntactic | constructs that the current parser can handle. | | * Error messages. The new error messages should be of equal or better | quality than existing ones. | | * Elegance. The new parser should bring simplification to other parts | of the compiler (e.g. removal of pattern constructors from HsExpr). | And one of the design principles is to represent things by dedicated | data structures, in contrast to the current state of affairs where we | represent patterns as expressions, data constructor declarations as | types (before D5180), etc. | | Let me know if this is a good/acceptable direction of travel. That's | definitely something that I personally would like to see happen. | | All the best, | - Vladislav | _______________________________________________ | ghc-devs mailing list | ghc-devs@haskell.org | https://na01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fmail.hask | ell.org%2Fcgi-bin%2Fmailman%2Flistinfo%2Fghc- | devs&data=02%7C01%7Csimonpj%40microsoft.com%7C19181de5c6bd493ab07a08d | 62d5edbe0%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C636746282778542095 | &sdata=lFRt1t4k3BuuRdyOqwOYTZcLPRB%2BtFJwfFtgMpNLxW0%3D&reserved= | 0 _______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs