Re: [PR] Remove Whitespace Tokens from Parser [datafusion-sqlparser-rs]

via GitHub Fri, 21 Nov 2025 22:30:05 -0800


iffyio commented on code in PR #2077:
URL: https://github.com/apache/datafusion-sqlparser-rs/pull/2077#discussion_r2552321018



##########
src/tokenizer.rs:
##########
@@ -896,14 +929,37 @@ impl<'a> Tokenizer<'a> {
             line: 1,
             col: 1,
         };
+        let mut prev_keyword = None;
+        let mut cs_handler = CopyStdinHandler::default();
 
         let mut location = state.location();
-        while let Some(token) = self.next_token(&mut state, buf.last().map(|t| &t.token))? {
-            let span = location.span_to(state.location());
+        while let Some(token) = self.next_token(
+            &mut location,
+            &mut state,
+            buf.last().map(|t| &t.token),
+            prev_keyword,
+            false,
+        )? {
+            if let Token::Word(Word { keyword, .. }) = &token {
+                if *keyword != Keyword::NoKeyword {
+                    prev_keyword = Some(*keyword);
+                }
+            }
 
+            let span = location.span_to(state.location());
+            cs_handler.update(&token);
             buf.push(TokenWithSpan { token, span });
-
             location = state.location();
+
+            if cs_handler.is_in_copy_from_stdin() {

Review Comment:
   I was thinking that on the parser end we have the delimiter context, so we know whether whitespace is in play. If it is, could the parsing code manually inject the whitespace character between tokens as it consumes them?
   i.e. given two arbitrary tokens, comparing their locations tells us whether they're right next to each other, separated by one or more whitespace characters, or separated by one or more newlines.
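
To make the location comparison concrete, here is a minimal sketch of the idea, not the crate's actual API. `Location`, `Span`, `Gap` and `gap_between` are hypothetical stand-ins defined only for this example. It assumes 1-based lines and columns and that a span's `end` is the position just past the token's last character, as `location.span_to(state.location())` in the diff above suggests.

```rust
/// Hypothetical stand-in for a source position (1-based line and column).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Location {
    pub line: u64,
    pub column: u64,
}

/// Hypothetical stand-in for a token span.
/// Assumption: `end` is the position just past the token's last character.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Span {
    pub start: Location,
    pub end: Location,
}

/// What separates two consecutive tokens, judged only from their spans.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Gap {
    /// The second token starts exactly where the first one ends.
    Adjacent,
    /// Same line, with this many columns between the two tokens.
    Spaces(u64),
    /// The second token starts this many lines below the end of the first.
    Newlines(u64),
}

/// Classify the gap between `prev` and the token that follows it.
pub fn gap_between(prev: Span, next: Span) -> Gap {
    if next.start.line > prev.end.line {
        Gap::Newlines(next.start.line - prev.end.line)
    } else if next.start.column > prev.end.column {
        Gap::Spaces(next.start.column - prev.end.column)
    } else {
        Gap::Adjacent
    }
}
```

With something like this, the parser could call `gap_between` on each pair of consecutive tokens while consuming a whitespace-sensitive section and re-insert a separator only when the result is not `Gap::Adjacent`. Two caveats under these assumptions: locations alone cannot distinguish a space from a tab, and a comment sitting between two tokens would also show up as a gap.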
   



