Re: [I] formatting the AST while preserving the source location information from the original query [datafusion-sqlparser-rs]
alamb commented on issue #1634: URL: https://github.com/apache/datafusion-sqlparser-rs/issues/1634#issuecomment-2585357782 > That would be great ! But with the current (incorrect) span information, we would end up with syntactically invalid sql. This seems like it is a reason to focus on improving the span accuracy (which is valuable for other reasons too) first before trying to also carry along the original tokens -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [I] formatting the AST while preserving the source location information from the original query [datafusion-sqlparser-rs]
graup commented on issue #1634: URL: https://github.com/apache/datafusion-sqlparser-rs/issues/1634#issuecomment-2584354845 I'm playing around with a different approach for editing parts of the AST that works more like tools like eslint work. The idea is that given accurate source spans, why don't we just edit the original query directly without round-tripping the whole AST? The steps would be: 1. parse query into AST. 2. Visit the AST, looking for nodes we want to replace. 3. Construct and render a new AST node for local replacement. 4. Use the old node's span() to replace the new text right in the source. This way we preserve all the other formatting outside of the node we're replacing, without any work on the AST displaying logic. Of course, this needs accurate and complete spans which we don't have yet, but sounds feasible to me. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [I] formatting the AST while preserving the source location information from the original query [datafusion-sqlparser-rs]
lovasoa commented on issue #1634: URL: https://github.com/apache/datafusion-sqlparser-rs/issues/1634#issuecomment-2571427562 That could work, but it's not necessarily a lot less work than a Display implementation that respects the Span information, and the result would be wasteful and less generally useful to other users of sqlparser... -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [I] formatting the AST while preserving the source location information from the original query [datafusion-sqlparser-rs]
alamb commented on issue #1634: URL: https://github.com/apache/datafusion-sqlparser-rs/issues/1634#issuecomment-2571421058 > I think there is a misunderstanding. My use case is: the user enters a query. I need to parse it, potentially edit the ast, and then stringify it back before feeding it to the database. When the database returns an error, the error contains position information. This position information references the stringified AST we fed to the database, not the original string entered by the user. Currently, there is no way to map that position in the output of Display back to a position in the original string I see the challenge -- you have to potentially edit the AST I wonder if you could do something (crazy) like 1. re-parse the stringified AST to get span information of the `Display` format 2. Walk the original AST and the re-parsed AST using the spans to create a mapping back to the original query -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [I] formatting the AST while preserving the source location information from the original query [datafusion-sqlparser-rs]
lovasoa commented on issue #1634: URL: https://github.com/apache/datafusion-sqlparser-rs/issues/1634#issuecomment-2571399681 > Specifically, to preserve the original SQL for error messages you could also keep the actual original SQL string somewhere and on error construct the error message using the orignal SQL (and the locations in that text that are now in Spans) I think there is a misunderstanding. My use case is: the user enters a query. I need to parse it, potentially edit the ast, and then stringify it back before feeding it to the database. When the database returns an error, the error contains position information. This position information references the stringified AST we fed to the database, not the original string entered by the user. Currently, there is no way to map that position in the output of Display back to a position in the original string. > As you hint in this description, trying to preserve the original formatting would likely be a very large undertaking (and thus I suspect practically infeasible given the amount of engineering we have available) That's true. If I'm the only one with that need, then it's probably not worth it. But maybe I'm not ? And the work does not have to be exhaustive to be useful. I've already done it just for respecting newlines and indentation in `UPDATE` statements (because the syntax is simpler than SELECT) in https://github.com/apache/datafusion-sqlparser-rs/pull/1636. I think improving upon it and doing something similar at least for newlines around keywords and identifier lists in SELECT .. FROM ... WHERE would already be quite useful. Regarding resources: @freshtonic , if this is useful to you, do you think this is something cipherstash could dedicate a few engineering hours to ? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [I] formatting the AST while preserving the source location information from the original query [datafusion-sqlparser-rs]
alamb commented on issue #1634: URL: https://github.com/apache/datafusion-sqlparser-rs/issues/1634#issuecomment-2571390419 I don't understand the usecase for preserving the original formatting after `Display` the AST. Specifically, to preserve the original SQL for error messages you could also keep the actual original SQL string somewhere and on error construct the error message using the orignal SQL (and the locations that are now in Spans) I think you could take that approach today without any additional work. As you hint in this description, trying to preserve the original formatting would likely be a very large undertaking (and thus I suspect practically infeasible given the amount of engineering we have available) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [I] formatting the AST while preserving the source location information from the original query [datafusion-sqlparser-rs]
freshtonic commented on issue #1634: URL: https://github.com/apache/datafusion-sqlparser-rs/issues/1634#issuecomment-2571263201 @lovasoa 👋 I'm interested in this -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
