Re: [racket-users] Towards an Incremental Racket Parser for better IDE experience?

Sam Tobin-Hochstadt Wed, 09 Dec 2020 11:24:30 -0800

Hi Nicolas,

I do want to encourage you to keep thinking about this stuff -- some
of the things I described are definitely doable and would be
interesting projects, even if re-writing the entire expander is a big
task.


Sam

On Wed, Dec 9, 2020 at 12:46 PM nicobao <[email protected]> wrote:
>
> Hi all,
>
> I've read your messages with great attention, especially Sam's very 
> comprehensive answer.
> I now clearly understand that it's research-level work, and I was 
> definitely too ambitious in trying to dig into it, as I have limited time 
> after my day job (and probably too limited knowledge as well, but that's a 
> non-problem as it wouldn't be a one-person task anyway).
> Nevertheless, I really appreciate the exchange.
>
> Kind regards,
> Nicolas
>
> On Wednesday, December 2, 2020 at 7:56:58 PM UTC+1 Sam Tobin-Hochstadt wrote:
>>
>> A few thoughts on these topics, which I've been thinking about for a while.
>>
>> First, let's distinguish two things. One is an _incremental_ system,
>> such as a parser: one that does less work in response to a small
>> change than it would need to do from scratch. The other is a system
>> with _error recovery_: one that, in the presence of an error, can
>> still provide a useful answer and/or continue on to discover other
>> errors. tree-sitter, for example, aims to do both of these, but
>> they're quite different.
>>
>> With that in mind, several points:
>>
>> 1. It would be relatively straightforward to build an incremental
>> _reader_ -- going from text to s-expressions. You could start from the
>> grammar here: 
>> https://github.com/racket/parser-tools/blob/master/parser-tools-lib/parser-tools/examples/read.rkt
>> which is just for Scheme, and the lexer here:
>> https://github.com/racket/syntax-color/blob/master/syntax-color-lib/syntax-color/racket-lexer.rkt
>> which is for full Racket, which as Robby says is already
>> error-tolerant. The read syntax (in the absence of reader extensions)
>> is definitely context-free and probably LR(1). The code for the reader
>> is here: 
>> https://github.com/racket/racket/tree/master/racket/src/expander/read
>>
>> However, just calling `read` from scratch every time isn't a big
>> bottleneck -- the biggest Racket-syntax file I have around is about
>> 86000 lines and takes 700ms to `read`.
>>
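
To make the "incremental reader" idea in point 1 concrete, here is a minimal sketch. It is written in Python rather than Racket, and it handles only a toy s-expression subset (parentheses and atoms; no strings, comments, or reader extensions). The names `split_top_level`, `parse`, and `IncrementalReader` are hypothetical and not part of Racket or any library. The sketch caches each top-level form by its text, so only forms whose text changed are parsed again.

```python
def split_top_level(src):
    """Split source text into top-level parenthesized chunks.

    Toy assumption: top-level atoms and a trailing unclosed form are ignored.
    """
    chunks, depth, start = [], 0, None
    for i, ch in enumerate(src):
        if ch == "(":
            if depth == 0:
                start = i
            depth += 1
        elif ch == ")" and depth > 0:
            depth -= 1
            if depth == 0:
                chunks.append(src[start:i + 1])
                start = None
    return chunks


def parse(chunk):
    """Parse one balanced chunk into nested lists of atom strings."""
    tokens = chunk.replace("(", " ( ").replace(")", " ) ").split()
    stack = [[]]
    for tok in tokens:
        if tok == "(":
            stack.append([])
        elif tok == ")":
            done = stack.pop()
            stack[-1].append(done)
        else:
            stack[-1].append(tok)
    return stack[0][0]


class IncrementalReader:
    """Re-parses only the top-level forms whose text changed since last read."""

    def __init__(self):
        self.cache = {}       # chunk text -> parsed form from the previous read
        self.parse_count = 0  # number of chunks actually parsed so far

    def read(self, src):
        forms, fresh = [], {}
        for chunk in split_top_level(src):
            if chunk in fresh:
                form = fresh[chunk]
            elif chunk in self.cache:
                form = self.cache[chunk]
            else:
                form = parse(chunk)
                self.parse_count += 1
            fresh[chunk] = form
            forms.append(form)
        self.cache = fresh  # forget forms that no longer appear in the text
        return forms


reader = IncrementalReader()
reader.read("(define x 1)\n(define (f y) (+ x y))")  # parses two forms
reader.read("(define x 2)\n(define (f y) (+ x y))")  # parses only the first
```

Note that this sketch still scans the whole text to find form boundaries; only the parsing is reused. A real incremental reader would presumably also track edit ranges to avoid the rescan.
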
>> 2. As Robby points out, the big challenge is the macro expander, which
>> (a) is not a grammar, (b) is large and complicated (the code is here:
>> https://github.com/racket/racket/tree/master/racket/src/expander and
>> it's about 35k lines), and (c) runs arbitrary Racket code in the
>> form of macros. I'm definitely interested in thinking about what an
>> incremental expander would look like, but that's a big research
>> project and probably would require a different model of macros than
>> Racket has right now. It would not work to use some existing parsing
>> toolkit like tree-sitter. You could perhaps write a new macro expander
>> using an incremental computation framework such as Adapton
>> [https://docs.rs/adapton/0.3.31/adapton/] or write something like
>> Adapton for Racket. How well that would work is an interesting
>> question. You could also rewrite the macro expander to be incremental
>> more directly.
>>
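
As a rough illustration of what "incremental" could mean for expansion in point 2, the sketch below memoizes a toy expander. It is in Python and is not Adapton, and it is nothing like Racket's actual expander. The class `MemoExpander`, the helper `to_key`, and the `my-or` macro are all hypothetical. It also ignores hygiene, scopes, and macros with side effects, which are exactly the parts that make the real problem a research project.

```python
def to_key(form):
    """Convert nested lists to nested tuples so forms can be used as dict keys."""
    if isinstance(form, list):
        return tuple(to_key(f) for f in form)
    return form


class MemoExpander:
    """Toy expander that caches the expansion of each form it has seen."""

    def __init__(self, macros):
        self.macros = dict(macros)  # macro name -> function(form) -> form
        self.cache = {}             # form key -> fully expanded form
        self.expansions = 0         # number of uncached expansion steps

    def define_macro(self, name, fn):
        # A changed macro may affect any cached result, so drop everything.
        # A finer-grained system would track which results depend on which macro.
        self.macros[name] = fn
        self.cache.clear()

    def expand(self, form):
        if not isinstance(form, list) or not form:
            return form  # atoms and empty lists expand to themselves
        key = to_key(form)
        if key in self.cache:
            return self.cache[key]
        self.expansions += 1
        head = form[0]
        if isinstance(head, str) and head in self.macros:
            # Rewrite with the macro, then keep expanding the result.
            result = self.expand(self.macros[head](form))
        else:
            result = [self.expand(f) for f in form]
        self.cache[key] = result
        return result


# (my-or a b) rewrites to (if a a b); deliberately naive, it duplicates `a`.
expander = MemoExpander({"my-or": lambda f: ["if", f[1], f[1], f[2]]})
expanded = expander.expand(["my-or", "a", ["my-or", "b", "c"]])
```

Keying the cache on the form alone is only valid in this toy setting. In Racket, the meaning of a form depends on its binding context, so a real cache key would need far more than the text of the form.
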
>> 3. An error-tolerant macro expander is more plausible, but would again
>> require substantial changes to the expander. One possible idea is to
>> use the information the macro stepper already uses to reconstruct the
>> partial program right before it went wrong, and supply that to the IDE
>> to use for completion/etc. Another idea would be to replace pieces of
>> erroneous syntax with something that allows the expander to continue
>> (this is how error-tolerant parsers work). There are probably lots
>> more ideas that we could come up with.
>>
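
The "replace erroneous syntax with something that lets processing continue" idea from point 3 is easiest to show at the reader level. Below is a minimal sketch in Python for a toy s-expression subset (no strings or comments). The function names and the `("ERROR", message)` placeholder are hypothetical, and an error-tolerant expander would be far more involved than this. The sketch never raises: a stray `)` becomes an ERROR node, and unclosed lists are closed at end of input with an ERROR node appended.

```python
def tokenize(src):
    """Split text into '(' , ')' and atom tokens (toy subset)."""
    return src.replace("(", " ( ").replace(")", " ) ").split()


def read_tolerant(src):
    """Return (forms, errors) for src, inserting ERROR nodes instead of failing."""
    errors = []
    stack = [[]]  # stack[0] collects top-level forms
    for tok in tokenize(src):
        if tok == "(":
            stack.append([])
        elif tok == ")":
            if len(stack) == 1:
                # Stray closer: record it and keep going.
                errors.append("unexpected )")
                stack[0].append(("ERROR", "unexpected )"))
            else:
                done = stack.pop()
                stack[-1].append(done)
        else:
            stack[-1].append(tok)
    # Close any lists still open at end of input, marking each one.
    while len(stack) > 1:
        errors.append("missing )")
        done = stack.pop()
        done.append(("ERROR", "missing )"))
        stack[-1].append(done)
    return stack[0], errors


forms, errors = read_tolerant("(define x 1) (define (f y) (+ x y")
# forms[0] is the intact first definition; forms[1] is a partial tree
# for the second definition, with ERROR nodes where ')' was missing.
```

An IDE could still offer completion for `x` from the first, intact definition. Closing everything at end of input is a crude recovery strategy; practical tools tend to use additional hints, such as indentation, to guess where the missing parenthesis belongs.
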
>> 4. Compiling to one of the OCaml intermediate languages is an
>> interesting idea -- I've thought about their flambda language as a
>> possible target before. The place to start is the `schemify` layer:
>> https://github.com/racket/racket/tree/master/racket/src/schemify that
>> turns fully-expanded Racket code into Scheme code for Chez Scheme.
>> Changing that to produce flambda would be plausible, although there
>> are a lot of mismatches between the languages that would be tricky to
>> overcome. Another possibility would be to directly produce JavaScript
>> from that layer. You might be interested in the RacketScript project:
>> https://github.com/vishesh/racketscript
>>
>> If you're interested in thinking more about these topics, or working
>> on them, I'm happy to offer more advice.
>>
>> Sam
>>
>> On Wed, Dec 2, 2020 at 9:53 AM nicobao <[email protected]> wrote:
>> >
>> > Hi!
>> >
>> > The Racket Reader and the Racket Expander always return "Error : blabla" 
>> > when you send them invalid Racket source code.
>> > As a consequence, when there is a source code error, DrRacket and the 
>> > Racket LSP cannot provide IDE features like "find references", 
>> > "info on hover", "find definition", etc.
>> > This is an issue, because 99% of the time one writes code, the code is 
>> > incorrect. Other languages (Rust, TypeScript/JS, Java, OCaml, etc.) rely 
>> > on an incremental parser that can provide a tree even if the source code 
>> > is wrong. Basically, it adds an "ERROR" node to the tree and goes on 
>> > instead of stopping everything and returning at the first error.
>> > Currently this compiler issue is preventing the Racket IDE from providing 
>> > a better user experience.
>> > For my practical use case of Racket, it is important.
>> >
>> > I would like to help work in that direction.
>> > I see two possible solutions:
>> > 1) improve the recursive descent parser of the Reader, as well as the 
>> > Expander, to make them incremental and fault-tolerant
>> > 2) rewrite the parser in something like tree-sitter or Menhir, at the 
>> > cost of having to rewrite the Reader/Expander logic (!!!)
>> >
>> > Both solutions are daunting tasks.
>> >
>> > For solution 1), could you point me to the source code of Racket's 
>> > recursive descent parser? What about the Expander?
>> >
>> > For solution 2), I was thinking of writing a tree-sitter grammar for 
>> > Racket. However, I can't find a formal description of the grammar, like 
>> > the one Scheme has here:
>> > https://www.scheme.com/tspl4/grammar.html#APPENDIXFORMALSYNTAX
>> > Of course, the Racket documentation is still quite comprehensive, but it 
>> > would be nice if anyone could tell me whether such a formal document 
>> > exists somewhere.
>> > Besides, I wonder whether Racket/Scheme could even be described using an 
>> > LR(1) or a GLR grammar?
>> >
>> > Finally, has any work been started in this direction?
>> >
>> > Totally off-topic, but has anyone ever thought of compiling Racket down to 
>> > OCaml, in order to reuse js_of_ocaml and produce optimized JS code from 
>> > Racket?
>> > I was wondering whether it would be feasible.
>> >
>> > Final note: I know all of that is _very_ ambitious!
>> >
>> > Kind regards,
>> > Nicolas
>> >
>> > --
>> > You received this message because you are subscribed to the Google Groups 
>> > "Racket Users" group.
>> > To unsubscribe from this group and stop receiving emails from it, send an 
>> > email to [email protected].
>> > To view this discussion on the web visit 
>> > https://groups.google.com/d/msgid/racket-users/d77440e3-1876-44e5-b52b-323d5715df66n%40googlegroups.com.
>

