LucaCappelletti94 opened a new issue, #2076: URL: https://github.com/apache/datafusion-sqlparser-rs/issues/2076
Hi,

Currently, whitespace tokens are stored in the parser and then filtered out at several distinct points in the parser logic, such as:

* https://github.com/apache/datafusion-sqlparser-rs/blob/67684c84d4c2589356c411ea4917dcf1defcd77c/src/parser/mod.rs#L4032-L4049
* https://github.com/apache/datafusion-sqlparser-rs/blob/67684c84d4c2589356c411ea4917dcf1defcd77c/src/parser/mod.rs#L4055-L4069
* https://github.com/apache/datafusion-sqlparser-rs/blob/67684c84d4c2589356c411ea4917dcf1defcd77c/src/parser/mod.rs#L4077-L4094
* https://github.com/apache/datafusion-sqlparser-rs/blob/67684c84d4c2589356c411ea4917dcf1defcd77c/src/parser/mod.rs#L4149-L4160
* https://github.com/apache/datafusion-sqlparser-rs/blob/67684c84d4c2589356c411ea4917dcf1defcd77c/src/parser/mod.rs#L4183-L4202

and many more.

As far as I know, SQL, unlike Python, is not a whitespace-sensitive language. It should therefore be safe to drop whitespace tokens entirely once [the tokenization process](https://github.com/apache/datafusion-sqlparser-rs/blob/67684c84d4c2589356c411ea4917dcf1defcd77c/src/tokenizer.rs#L937-L942) is complete. This should:

* Reduce memory requirements, as whitespace tokens would no longer be stored
* Significantly simplify the parser by removing all of its whitespace-handling logic
* Move the parser closer to a streaming design, though that will require many more PRs

Since such a PR would require quite a bit of effort on my part, I would appreciate some feedback before moving forward with it.

@iffyio do you happen to have an opinion on such a refactoring?

Ciao,
Luca
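To illustrate the idea, here is a minimal sketch of filtering whitespace once, right after tokenization, so that later parsing code never has to skip it. The `Token` enum and the `strip_whitespace` and `demo` functions are simplified, hypothetical stand-ins written for this example; they are not the crate's actual types or API.

```rust
/// Simplified, hypothetical token type used only for this sketch.
/// It is NOT the real `Token` type from the crate.
#[derive(Debug, Clone, PartialEq)]
enum Token {
    Word(String),
    Comma,
    Whitespace(String),
}

/// Drops every whitespace token in a single pass after tokenization,
/// so downstream parsing code only ever sees significant tokens.
fn strip_whitespace(tokens: Vec<Token>) -> Vec<Token> {
    tokens
        .into_iter()
        .filter(|t| !matches!(t, Token::Whitespace(_)))
        .collect()
}

/// Example usage: the token stream for `SELECT a, b` with whitespace removed.
fn demo() -> Vec<Token> {
    let tokens = vec![
        Token::Word("SELECT".to_string()),
        Token::Whitespace(" ".to_string()),
        Token::Word("a".to_string()),
        Token::Comma,
        Token::Whitespace(" ".to_string()),
        Token::Word("b".to_string()),
    ];
    strip_whitespace(tokens)
}
```

With something along these lines, the "peek/next non-whitespace token" helpers in the parser could become plain index lookups, assuming no dialect actually depends on whitespace between tokens.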
