iffyio commented on code in PR #2077:
URL:
https://github.com/apache/datafusion-sqlparser-rs/pull/2077#discussion_r2552321018
##########
src/tokenizer.rs:
##########
@@ -896,14 +929,37 @@ impl<'a> Tokenizer<'a> {
line: 1,
col: 1,
};
+ let mut prev_keyword = None;
+ let mut cs_handler = CopyStdinHandler::default();
let mut location = state.location();
- while let Some(token) = self.next_token(&mut state, buf.last().map(|t| &t.token))? {
- let span = location.span_to(state.location());
+ while let Some(token) = self.next_token(
+ &mut location,
+ &mut state,
+ buf.last().map(|t| &t.token),
+ prev_keyword,
+ false,
+ )? {
+ if let Token::Word(Word { keyword, .. }) = &token {
+ if *keyword != Keyword::NoKeyword {
+ prev_keyword = Some(*keyword);
+ }
+ }
+ let span = location.span_to(state.location());
+ cs_handler.update(&token);
buf.push(TokenWithSpan { token, span });
-
location = state.location();
+
+ if cs_handler.is_in_copy_from_stdin() {
Review Comment:
I was thinking this could be handled on the parser end: there we have the
delimiter context to know whether whitespace is in play, and if so, the parsing
code could manually inject the whitespace character between tokens as it
consumes them?
i.e. given two arbitrary tokens, by comparing their locations we can tell
whether they're right next to each other, separated by one or more whitespace
characters, or separated by one or more newlines.
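
A rough sketch of that comparison, for illustration only. `Location`, `Span`, `Gap` and `gap_between` here are hypothetical stand-ins defined in the snippet, not the crate's actual types, and the sketch assumes that a span's `end` points just past the token's last character and that a token never spans multiple lines:

```rust
/// Hypothetical stand-in for a token position (1-based line and column).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Location {
    pub line: u64,
    pub column: u64,
}

/// Hypothetical span. Assumption: `end` is the position just past the
/// token's last character, so adjacent tokens share `end == start`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Span {
    pub start: Location,
    pub end: Location,
}

/// What sits between two consecutive tokens, judged from locations alone.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Gap {
    /// The second token starts exactly where the first one ends.
    Adjacent,
    /// Same line, separated by this many columns of whitespace.
    Spaces(u64),
    /// The second token starts this many lines below the first one's end.
    Newlines(u64),
}

/// Classify the gap between `prev` and the token that follows it.
pub fn gap_between(prev: Span, next: Span) -> Gap {
    if next.start.line > prev.end.line {
        Gap::Newlines(next.start.line - prev.end.line)
    } else if next.start.column > prev.end.column {
        Gap::Spaces(next.start.column - prev.end.column)
    } else {
        Gap::Adjacent
    }
}
```

Note that a column difference only says how many characters were skipped, not whether they were spaces or tabs, so the parser would still rely on the delimiter context to decide which character to inject.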
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]