Hi Nicolas,

I do want to encourage you to keep thinking about this stuff -- some of the things I described are definitely doable and would be interesting projects, even if re-writing the entire expander is a big task.
Sam

On Wed, Dec 9, 2020 at 12:46 PM nicobao <[email protected]> wrote:
>
> Hi all,
>
> I've read with great attention your messages, especially Sam's very
> comprehensive answer.
> I now clearly understand that it's research-level work, and I was
> definitely too ambitious in trying to dig into that - as I have limited time
> after my day job (and probably too limited knowledge too, but that's a
> non-problem as it wouldn't be a one-person task anyway).
> Nevertheless, I really appreciate the exchange.
>
> Kind regards,
> Nicolas
>
> On Wednesday, December 2, 2020 at 7:56:58 PM UTC+1 Sam Tobin-Hochstadt wrote:
>>
>> A few thoughts on these topics, which I've been thinking about for a while.
>>
>> First, let's distinguish two things. One is an _incremental_ system,
>> such as a parser, which is one that does less work in response to a
>> small change than it would need to do from scratch. The other is a
>> system with _error recovery_, which is one where in the presence of
>> one error, the system can still provide a useful answer and/or
>> continue on to discover other errors. tree-sitter, for example, aims
>> to do both of these, but they're quite different.
>>
>> With that in mind, several points:
>>
>> 1. It would be relatively straightforward to build an incremental
>> _reader_ -- going from text to s-expressions. You could start from the
>> grammar here:
>> https://github.com/racket/parser-tools/blob/master/parser-tools-lib/parser-tools/examples/read.rkt
>> which is just for Scheme, and the lexer here:
>> https://github.com/racket/syntax-color/blob/master/syntax-color-lib/syntax-color/racket-lexer.rkt
>> which is for full Racket, which as Robby says is already
>> error-tolerant. The read syntax (in the absence of reader extensions)
>> is definitely context-free and probably LR(1).
>> The code for the reader is here:
>> https://github.com/racket/racket/tree/master/racket/src/expander/read
>>
>> However, just calling `read` from scratch every time isn't a big
>> bottleneck -- the biggest Racket-syntax file I have around is about
>> 86000 lines and takes 700ms to `read`.
>>
>> 2. As Robby points out, the big challenge is the macro expander, which
>> is (a) not a grammar, (b) large and complicated (the code is here:
>> https://github.com/racket/racket/tree/master/racket/src/expander and
>> it's about 35k lines) and (c) it runs arbitrary Racket code in the
>> form of macros. I'm definitely interested in thinking about what an
>> incremental expander would look like, but that's a big research
>> project and probably would require a different model of macros than
>> Racket has right now. It would not work to use some existing parsing
>> toolkit like tree-sitter. You could perhaps write a new macro expander
>> using an incremental computation framework such as Adapton
>> [https://docs.rs/adapton/0.3.31/adapton/] or write something like
>> Adapton for Racket. How well that would work is an interesting
>> question. You could also rewrite the macro expander to be incremental
>> more directly.
>>
>> 3. An error-tolerant macro expander is more plausible, but would again
>> require substantial changes to the expander. One possible idea is to
>> use the information the macro stepper already uses to reconstruct the
>> partial program right before it went wrong, and supply that to the IDE
>> to use for completion/etc. Another idea would be to replace pieces of
>> erroneous syntax with something that allows the expander to continue
>> (this is how error-tolerant parsers work). There are probably lots
>> more ideas that we could come up with.
>>
>> 4. Compiling to one of the OCaml intermediate languages is an
>> interesting idea -- I've thought about their flambda language as a
>> possible target before.
>> The place to start is the `schemify` layer:
>> https://github.com/racket/racket/tree/master/racket/src/schemify that
>> turns fully-expanded Racket code into Scheme code for Chez Scheme.
>> Changing that to produce flambda would be plausible, although there
>> are a lot of mismatches between the languages that would be tricky to
>> overcome. Another possibility would be to directly produce JavaScript
>> from that layer. You might be interested in the RacketScript project:
>> https://github.com/vishesh/racketscript
>>
>> If you're interested in thinking more about these topics, or working
>> on them, I'm happy to offer more advice.
>>
>> Sam
>>
>> On Wed, Dec 2, 2020 at 9:53 AM nicobao <[email protected]> wrote:
>> >
>> > Hi!
>> >
>> > The Racket Reader and the Racket Expander always return "Error : blabla"
>> > when you send them bad Racket source code.
>> > As a consequence, when there is a source code error, DrRacket and the
>> > Racket LSP cannot provide IDE functionalities like "find references",
>> > "info on hover", "find definition"...etc.
>> > This is an issue, because 99% of the time one writes code, the code is
>> > incorrect. Other languages (Rust, Typescript/JS, Java, OCaml...etc) rely
>> > on an incremental parser that can provide a tree even if the source code
>> > is wrong. Basically it adds an "ERROR" node in the tree, and goes on instead
>> > of stopping everything and returning at the first error.
>> > Currently this compiler issue is blocking the Racket IDE from providing a
>> > better user experience.
>> > For my practical use case of Racket, it is important.
>> >
>> > I would like to help work in that direction.
>> > I see two possible solutions to that:
>> > 1) improve the recursive descent parser of the Reader, as well as the
>> > Expander to make them incremental and fault-tolerant
>> > 2) re-writing the parser in something like tree-sitter or Menhir, at the
>> > cost of having to re-write the Reader/Expander logic (!!!)
>> >
>> > Both solutions are daunting tasks.
>> >
>> > For solution 1), could you point me to Racket's recursive descent
>> > parser source code? What about the Expander?
>> >
>> > For solution 2), I was thinking of writing a tree-sitter grammar for
>> > Racket. However, I can't find a formal description of the grammar, like
>> > Scheme did here:
>> > https://www.scheme.com/tspl4/grammar.html#APPENDIXFORMALSYNTAX
>> > Of course, the Racket documentation is still quite comprehensive, but it
>> > would be nice if anyone could tell me if there is such a formal document
>> > somewhere?
>> > Besides, I wonder whether Racket/Scheme could even be described using an
>> > LR(1) or a GLR grammar?
>> >
>> > Finally, has any work been started in this direction?
>> >
>> > Totally off-topic, but has anyone ever thought of compiling Racket down to
>> > OCaml, in order to reuse js_of_ocaml and produce optimized JS code from
>> > Racket?
>> > I was wondering whether it would be feasible.
>> >
>> > Final note: I know all of that is _very_ ambitious!
>> >
>> > Kind regards,
>> > Nicolas
>> >
>> > --
>> > You received this message because you are subscribed to the Google Groups
>> > "Racket Users" group.
>> > To unsubscribe from this group and stop receiving emails from it, send an
>> > email to [email protected].
>> > To view this discussion on the web visit
>> > https://groups.google.com/d/msgid/racket-users/d77440e3-1876-44e5-b52b-323d5715df66n%40googlegroups.com.
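
For readers following the thread, here is a minimal sketch of the "ERROR node" idea discussed above: a reader that records a placeholder where the input is malformed and keeps going instead of stopping at the first error. It is written in Python rather than Racket, and it only handles a toy s-expression syntax (atoms and parentheses; no strings, comments, quotes, or reader extensions). The names `Error`, `tokenize`, and `read_all` are made up for this illustration; this is not how Racket's reader or tree-sitter is actually implemented, and it says nothing about the much harder expander problem.

```python
from dataclasses import dataclass


@dataclass
class Error:
    """Hypothetical placeholder node recorded where the input is malformed."""
    message: str
    pos: int  # character offset of the offending token


def tokenize(text):
    """Yield (offset, token) pairs; each parenthesis is its own token."""
    i = 0
    while i < len(text):
        c = text[i]
        if c.isspace():
            i += 1
        elif c in "()":
            yield i, c
            i += 1
        else:
            start = i
            while i < len(text) and not text[i].isspace() and text[i] not in "()":
                i += 1
            yield start, text[start:i]


def read_all(text):
    """Read every top-level form; bad input yields Error nodes, not exceptions."""
    tokens = list(tokenize(text))
    pos = 0

    def read_form():
        nonlocal pos
        start, tok = tokens[pos]
        pos += 1
        if tok == ")":
            # Stray closer: record it and let the caller carry on.
            return Error("unexpected )", start)
        if tok != "(":
            return tok  # an atom, kept as a plain string
        items = []
        while pos < len(tokens) and tokens[pos][1] != ")":
            items.append(read_form())
        if pos == len(tokens):
            # Ran out of input: keep what was read and mark the gap.
            items.append(Error("missing )", start))
        else:
            pos += 1  # consume the matching ")"
        return items

    forms = []
    while pos < len(tokens):
        forms.append(read_form())
    return forms
```

Calling `read_all("(define x 1) ) (f (g y")` still returns the well-formed `define` form as a list, with `Error` nodes standing in for the stray `)` and the two unclosed lists, so a tool could keep working with the parts that did parse.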

