Re: [I] formatting the AST while preserving the source location information from the original query [datafusion-sqlparser-rs]

2025-01-11 Thread via GitHub


alamb commented on issue #1634:
URL: 
https://github.com/apache/datafusion-sqlparser-rs/issues/1634#issuecomment-2585357782

   > That would be great ! But with the current (incorrect) span information, 
we would end up with syntactically invalid sql.
   
   This seems like it is a reason to focus on improving the span accuracy 
(which is valuable for other reasons too) first before trying to also carry 
along the original tokens


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [I] formatting the AST while preserving the source location information from the original query [datafusion-sqlparser-rs]

2025-01-10 Thread via GitHub


graup commented on issue #1634:
URL: 
https://github.com/apache/datafusion-sqlparser-rs/issues/1634#issuecomment-2584354845

   I'm playing around with a different approach for editing parts of the AST 
that works more like tools like eslint work. The idea is that given accurate 
source spans, why don't we just edit the original query directly without 
round-tripping the whole AST?
   
   The steps would be: 1. parse query into AST. 2. Visit the AST, looking for 
nodes we want to replace. 3. Construct and render a new AST node for local 
replacement. 4. Use the old node's span() to replace the new text right in the 
source.
   
   This way we preserve all the other formatting outside of the node we're 
replacing, without any work on the AST displaying logic.
   
   Of course, this needs accurate and complete spans which we don't have yet, 
but sounds feasible to me.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [I] formatting the AST while preserving the source location information from the original query [datafusion-sqlparser-rs]

2025-01-04 Thread via GitHub


lovasoa commented on issue #1634:
URL: 
https://github.com/apache/datafusion-sqlparser-rs/issues/1634#issuecomment-2571427562

   That could work, but it's not necessarily a lot less work than a Display 
implementation that respects the Span information, and the result would be 
wasteful and less generally useful to other users of sqlparser...


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [I] formatting the AST while preserving the source location information from the original query [datafusion-sqlparser-rs]

2025-01-04 Thread via GitHub


alamb commented on issue #1634:
URL: 
https://github.com/apache/datafusion-sqlparser-rs/issues/1634#issuecomment-2571421058

   > I think there is a misunderstanding. My use case is: the user enters a 
query. I need to parse it, potentially edit the ast, and then stringify it back 
before feeding it to the database. When the database returns an error, the 
error contains position information. This position information references the 
stringified AST we fed to the database, not the original string entered by the 
user. Currently, there is no way to map that position in the output of Display 
back to a position in the original string
   
   I see the challenge -- you have to potentially edit the AST
   
   I wonder if you could do something  (crazy) like 
   1. re-parse the stringified AST to get span information of the `Display` 
format
   2. Walk the original AST and the re-parsed AST  using the spans  to create a 
mapping back to the original query
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [I] formatting the AST while preserving the source location information from the original query [datafusion-sqlparser-rs]

2025-01-04 Thread via GitHub


lovasoa commented on issue #1634:
URL: 
https://github.com/apache/datafusion-sqlparser-rs/issues/1634#issuecomment-2571399681

   > Specifically, to preserve the original SQL for error messages you could 
also keep the actual original SQL string somewhere and on error construct the 
error message using the orignal SQL (and the locations in that text that are 
now in Spans)
   
   I think there is a misunderstanding. My use case is: the user enters a 
query. I need to parse it, potentially edit the ast, and then stringify it back 
before feeding it to the database. When the database returns an error, the 
error contains position information. This position information references the 
stringified AST we fed to the database, not the original string entered by the 
user. Currently, there is no way to map that position in the output of Display 
back to a position in the original string.
   
   > As you hint in this description, trying to preserve the original 
formatting would likely be a very large undertaking (and thus I suspect 
practically infeasible given the amount of engineering we have available)
   
   That's true. If I'm the only one with that need, then it's probably not 
worth it. But maybe I'm not ? And the work does not have to be exhaustive to be 
useful. I've already done it just for respecting newlines and indentation in 
`UPDATE` statements (because the syntax is simpler than SELECT) in 
https://github.com/apache/datafusion-sqlparser-rs/pull/1636. I think improving 
upon it and doing something similar at least for newlines around keywords and 
identifier lists in SELECT .. FROM ... WHERE would already be quite useful.
   
   Regarding resources: @freshtonic , if this is useful to you, do you think 
this is something cipherstash could dedicate a few engineering hours to ? 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [I] formatting the AST while preserving the source location information from the original query [datafusion-sqlparser-rs]

2025-01-04 Thread via GitHub


alamb commented on issue #1634:
URL: 
https://github.com/apache/datafusion-sqlparser-rs/issues/1634#issuecomment-2571390419

   I don't understand the usecase for preserving the original formatting after 
`Display` the AST.
   
   Specifically, to preserve the original SQL for error messages you could also 
keep the actual original SQL string somewhere and on error construct the error 
message using the orignal SQL (and the locations that are now in Spans)
   
   I think you could take that approach today without any additional work.
   
   As you hint in this description, trying to preserve the original formatting 
would likely be a very large undertaking (and thus I suspect practically 
infeasible given the amount of engineering we have available)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [I] formatting the AST while preserving the source location information from the original query [datafusion-sqlparser-rs]

2025-01-04 Thread via GitHub


freshtonic commented on issue #1634:
URL: 
https://github.com/apache/datafusion-sqlparser-rs/issues/1634#issuecomment-2571263201

   @lovasoa 👋  I'm interested in this


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]