Re: Rust, Lexer Issues

Antonio Mon, 06 Mar 2023 09:49:40 -0800

Hi László,

Some comments inlined below.


On 6/3/23 17:23, László Kishalmi wrote:

Well, this one is mostly for Antonio.

At the beginning of the "Rust, anyone" thread you were hitting some assertion
errors in LexerInput.
I have not seen the stacktrace, but I'd guess that's due to the original
error handling in ANTLR.


Yep, at the beginning I hit some AssertionErrors in LexerInput.

The problem appeared when I typed a quoted string while another string token was present later in the file. The Lexer (I can't tell whether Antlr or NetBeans) was getting confused because it was hitting the EOF token (while looking for a second matching ") and it couldn't find one.

I solved the problem in the Lexer grammar by allowing the special token EOF to end STRING_LITERAL [1], RAW_STRING_LITERAL [2] and BYTE_STRING_LITERAL [3] (note the ('"' | EOF) construct there). After that everything worked normally.

I think Antlr4 was getting confused because it couldn't detect the end of the token, and this unexpected EOF was causing trouble (and generating those AssertionErrors). Maybe we can detect these situations in Antlr4Bridge, but I can't really tell.
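
To illustrate the idea behind the ('"' | EOF) construct outside of Antlr, here is a minimal hand-written sketch in Java. It is not the Rust grammar or the NetBeans bridge; the class and method names (EofTolerantStringLexer, tokenize, Token) are made up for this example. The point is only that an unterminated string still yields a token covering the rest of the input, so every character ends up in some token.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy lexer: only knows STRING_LITERAL and OTHER tokens.
final class EofTolerantStringLexer {

    /** A token kind, its text, and its start offset in the input. */
    static final class Token {
        final String kind;
        final String text;
        final int start;

        Token(String kind, String text, int start) {
            this.kind = kind;
            this.text = text;
            this.start = start;
        }

        @Override
        public String toString() {
            return kind + "@" + start + " [" + text + "]";
        }
    }

    /** Splits the input into tokens; an unterminated string never throws. */
    static List<Token> tokenize(String input) {
        List<Token> tokens = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            int start = i;
            if (input.charAt(i) == '"') {
                i++; // consume the opening quote
                while (i < input.length() && input.charAt(i) != '"') {
                    // skip the character after a backslash, e.g. \" inside the literal
                    if (input.charAt(i) == '\\' && i + 1 < input.length()) {
                        i++;
                    }
                    i++;
                }
                // Counterpart of ('"' | EOF): consume the closing quote if there
                // is one, otherwise let the end of input terminate the literal.
                if (i < input.length()) {
                    i++;
                }
                tokens.add(new Token("STRING_LITERAL", input.substring(start, i), start));
            } else {
                while (i < input.length() && input.charAt(i) != '"') {
                    i++;
                }
                tokens.add(new Token("OTHER", input.substring(start, i), start));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // The closing quote is missing, as it is while the user is still typing.
        for (Token t : tokenize("let s = \"abc")) {
            System.out.println(t);
        }
    }
}
```

With the input above, the sketch produces an OTHER token for `let s = ` and a STRING_LITERAL token for `"abc`, with no characters left unassigned.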


Is that true? Is that the reason why:
https://github.com/apache/netbeans/blob/5934827e80797ec9b93ff28635ce7fca12627a70/rust/rust.grammar/src/org/netbeans/modules/rust/grammar/RustLanguageLexer.java#L80
has been put in code?

Not really, `lexer.removeErrorListeners()` (and `parser.removeErrorListeners()`) seems to be the standard practice in Antlr4: Antlr4 adds an error listener that logs errors to the console (System.err), and I wanted to get rid of that for NetBeans.


I'm asking because I'm hitting the assertion in my recent HCL Lexer and
suspect the default ANTLR error handling as the source. So this would just
reassure me, and I could also replace the default error handler in the
AbstractAntlrLexerBridge.

See [4] (the Rust AST building parser) and [5] (a custom Antlr4 error strategy) for a more complete example of lexing/parsing and error handling.

The RustANTLRErrorStrategy [5] throws RecognitionExceptions on reportMissingToken. This stops the parser from asking for new tokens (as far as I can tell) and better handles situations where the user is typing something and parsing cannot continue.
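
As a rough illustration of why throwing on a missing token stops further token requests, here is a small self-contained Java sketch. It does not use Antlr and is not the RustANTLRErrorStrategy; FailFastParserSketch, CountingTokenSource, MissingTokenException and parseLet are hypothetical names invented for this example, and the "grammar" is just `let NAME = VALUE ;`.

```java
import java.util.Iterator;
import java.util.List;

// Hypothetical toy parser that aborts at the first missing token.
final class FailFastParserSketch {

    /** Stand-in for a recognition error; not Antlr's RecognitionException. */
    static final class MissingTokenException extends RuntimeException {
        MissingTokenException(String message) {
            super(message);
        }
    }

    /** Wraps a token list and counts how many tokens the parser requested. */
    static final class CountingTokenSource {
        private final Iterator<String> tokens;
        int requested = 0;

        CountingTokenSource(List<String> tokens) {
            this.tokens = tokens.iterator();
        }

        String next() {
            requested++;
            return tokens.hasNext() ? tokens.next() : "<EOF>";
        }
    }

    /** Parses `let NAME = VALUE ;` and throws as soon as a token is missing. */
    static void parseLet(CountingTokenSource source) {
        expect(source, "let");
        source.next(); // NAME, any token is accepted in this toy grammar
        expect(source, "=");
        source.next(); // VALUE
        expect(source, ";");
    }

    private static void expect(CountingTokenSource source, String expected) {
        String actual = source.next();
        if (!expected.equals(actual)) {
            // Throwing here unwinds the parse, so no later token is requested.
            throw new MissingTokenException(
                    "expected '" + expected + "' but found '" + actual + "'");
        }
    }

    public static void main(String[] args) {
        // The "=" is missing, as it might be while the user is still typing.
        CountingTokenSource source =
                new CountingTokenSource(List.of("let", "x", "42", ";"));
        try {
            parseLet(source);
        } catch (MissingTokenException e) {
            System.out.println(e.getMessage() + " after " + source.requested + " tokens");
        }
    }
}
```

In this sketch the parse is abandoned at the third token, and the trailing `;` is never pulled from the token source.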


BTW, Thanks for moving the Rust effort!


Thank YOU for the Antlr4 Bridge, Rust support couldn't happen otherwise!

Kind regards,
Antonio


[1]
https://github.com/apache/netbeans/blob/master/rust/rust.grammar/src/org/netbeans/modules/rust/grammar/antlr4/g4/RustLexer.g4#L208

[2]
https://github.com/apache/netbeans/blob/master/rust/rust.grammar/src/org/netbeans/modules/rust/grammar/antlr4/g4/RustLexer.g4#L212

[3]
https://github.com/apache/netbeans/blob/master/rust/rust.grammar/src/org/netbeans/modules/rust/grammar/antlr4/g4/RustLexer.g4#L216

[4]
https://github.com/apache/netbeans/blob/master/rust/rust.grammar/src/org/netbeans/modules/rust/grammar/ast/RustAST.java#L151

[5]
https://github.com/apache/netbeans/blob/master/rust/rust.grammar/src/org/netbeans/modules/rust/grammar/ast/RustAST.java#L112

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@netbeans.apache.org
For additional commands, e-mail: dev-h...@netbeans.apache.org

For further information about the NetBeans mailing lists, visit:
https://cwiki.apache.org/confluence/display/NETBEANS/Mailing+lists
