Re: [PR] Add support for `GO` batch delimiter in SQL Server [datafusion-sqlparser-rs]
github-actions[bot] closed pull request #1809: Add support for `GO` batch delimiter in SQL Server URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1809 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] Add support for `GO` batch delimiter in SQL Server [datafusion-sqlparser-rs]
github-actions[bot] commented on PR #1809: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1809#issuecomment-3766102654 Thank you for your contribution. Unfortunately, this pull request is stale because it has been open 60 days with no activity. Please remove the stale label or comment or this will be closed in 7 days. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] Add support for `GO` batch delimiter in SQL Server [datafusion-sqlparser-rs]
aharpervc commented on PR #1809: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1809#issuecomment-3553134546 Still waiting on review 🤞 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] Add support for `GO` batch delimiter in SQL Server [datafusion-sqlparser-rs]
github-actions[bot] commented on PR #1809: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1809#issuecomment-3550350484 Thank you for your contribution. Unfortunately, this pull request is stale because it has been open 60 days with no activity. Please remove the stale label or comment or this will be closed in 7 days. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] Add support for `GO` batch delimiter in SQL Server [datafusion-sqlparser-rs]
aharpervc commented on PR #1809: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1809#issuecomment-3313781701 Apologies for the (even longer!) delay on this PR. I have rebase on latest main and addressed the single remaining code comment discussion. I think this is finally ready for review @iffyio @alamb -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] Add support for `GO` batch delimiter in SQL Server [datafusion-sqlparser-rs]
aharpervc commented on code in PR #1809:
URL:
https://github.com/apache/datafusion-sqlparser-rs/pull/1809#discussion_r2364501422
##
src/parser/mod.rs:
##
@@ -484,8 +488,9 @@ impl<'a> Parser<'a> {
}
let statement = self.parse_statement()?;
+// Treat batch delimiter as an end of statement, so no additional
statement delimiter expected here
Review Comment:
Done 👍
##
src/parser/mod.rs:
##
@@ -475,6 +475,10 @@ impl<'a> Parser<'a> {
if expecting_statement_delimiter && word.keyword ==
Keyword::END {
break;
}
+
+if expecting_statement_delimiter && word.keyword ==
Keyword::GO {
Review Comment:
Done 👍
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] Add support for `GO` batch delimiter in SQL Server [datafusion-sqlparser-rs]
iffyio commented on code in PR #1809:
URL:
https://github.com/apache/datafusion-sqlparser-rs/pull/1809#discussion_r2099256280
##
src/parser/mod.rs:
##
@@ -475,6 +475,10 @@ impl<'a> Parser<'a> {
if expecting_statement_delimiter && word.keyword ==
Keyword::END {
break;
}
+
+if expecting_statement_delimiter && word.keyword ==
Keyword::GO {
Review Comment:
Can we add a clarifying comment here, mentioning that for the `GO`
statement, it is not necessary that the preceeding statement ends with a
semicolon (ideally with a sql example)?
##
src/parser/mod.rs:
##
@@ -484,8 +488,9 @@ impl<'a> Parser<'a> {
}
let statement = self.parse_statement()?;
+// Treat batch delimiter as an end of statement, so no additional
statement delimiter expected here
Review Comment:
```suggestion
// The `GO` statement optionally ends with a semicolon
```
thinking of a this more straightforward description, it would not be obvious
to a reader unfamiliar with the statement that 'batch delimiter' is the same
thing in this context
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] Add support for `GO` batch delimiter in SQL Server [datafusion-sqlparser-rs]
iffyio commented on PR #1809: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1809#issuecomment-2896488108 The illustration makes a lot of sense to me, thanks @aharpervc! I think we should be able to go with the current idea in this PR. Could you take a look at the conflicts in the branch when you get the time and we can look to get it mergeable? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] Add support for `GO` batch delimiter in SQL Server [datafusion-sqlparser-rs]
aharpervc commented on PR #1809: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1809#issuecomment-2894713954 > Not sure I necessarily agree with this sentiment. I think if there are constructs in a file that arent sql constructs, they should be stripped off from the file before parsing to the library. Parsing non-sql constructs would be out of scope for this library. I actually agree with your perspective here. The central question is, "is the `go` batch delimiter found in SQL files 'a SQL construct' or not"? (Then we have a separate question on the implementation, which is the options list I suggested above). For illustration, something I think we'd both agree is _not_ a SQL construct & therefore _not_ appropriate for this parse is [Template Syntax](https://learn.microsoft.com/en-us/ssms/template/templates-ssms). That seems pretty clearly something that should be filled in prior to expecting a SQL document to parse. For `go`, here's what I can say: | evidence | a SQL construct | not a SQL construct | |---|---|---| | [documented as a "statement" alongside other SQL statements](https://learn.microsoft.com/en-us/sql/t-sql/language-elements/sql-server-utilities-statements-go) & indistinguishable from a statement in code | ✅ | | | documented as ["commands that are not Transact-SQL statements"](https://learn.microsoft.com/en-us/sql/t-sql/language-elements/sql-server-utilities-statements-go) & "GO is not a Transact-SQL statement" | | ✅ | | [SQL Server's ManagedBatchParser parses it](https://learn.microsoft.com/en-us/dotnet/api/managedbatchparser.parser.setbatchdelimiter?view=sqlserver-2016) (actually, configurable (!)) | ✅ | | | Apps need to parse SQL to find `go`'s (this library or another) | ✅ | ✅ | -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] Add support for `GO` batch delimiter in SQL Server [datafusion-sqlparser-rs]
iffyio commented on PR #1809: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1809#issuecomment-2879231551 > I think it'd be useful read a whole file into a string, then pass it to this library for parsing, then process the AST's Not sure I necessarily agree with this sentiment. I think if there are constructs in a file that arent sql constructs, they should be stripped off from the file before parsing to the library. Parsing non-sql constructs would be out of scope for this library. As a result, similar to ideally avoiding changes to the core parsing loop, I think it would also be undesirable to change or introduce parser APIs in order to support this feature. My current thinking is that if supporting the feature is a low hanging fruit that doesn't introduce too much complexity to the codebase, then we could look to support it as a one-off. Beyond that scenario afaict, downstream users of the library are expected strip off GO lines from input files or do similar preprocessing, before passing input onto the library. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] Add support for `GO` batch delimiter in SQL Server [datafusion-sqlparser-rs]
aharpervc commented on PR #1809:
URL:
https://github.com/apache/datafusion-sqlparser-rs/pull/1809#issuecomment-2874016995
> But actually I think if we can come up with something that doesn't touch
that core parser loop we might have something mergeable. I'll reopen this PR,
feel free to iterate if you have ideas to try out and I'm happy to take a look
when you do!
Sounds good... I will brainstorm on this 👍.
---
It may also be useful to clarify what is my overall objective... basically,
I think it'd be useful read a whole file into a string, then pass it to this
library for parsing, then process the AST's. Since SQL code I work with (and in
the industry) often includes `GO` batch delimiters in the file source code, and
this library doesn't support batch delimiters outside of this PR, that
presented a difficulty. Initially I was thinking that the smoothest approach
would be to add batch delimiter support directly into the parser. However, this
discussion is making me consider the pro/con more closely.
Something else I was thinking about is how the current "parse statements"
API _implies_ that you're only giving the parser sql of a _single batch_. Eg,
`single batch string` -> `Vec` makes complete sense in this model.
However this PR introduced a much more odd model, where a single string could
actually be multiple batches. Eg, `Vec` -> `Vec` is a
strange API.
So I think there could be a couple possible approaches going forward:
_Option 1: Embrace Batches_
We could introduce a Batch struct & new top level `String` -> `Vec`,
then `Batch` -> `Vec`. `Batch` could be like
```rust
struct Batch {
statements: Vec,
count: u16 // or whatever
}
```
This seems useful because it makes explicit what was formerly implicit.
Probably all the existing top level API's could be left alone, so all existing
users uninterested in Batches would be unaffected. Then we'd want something
like `parse_batches` to return a `Vec` and then do all the normal
parsing.
The downside is, it would introduce a new concept & ongoing maintenance when
perhaps there isn't a lot of interest. Also, not every dialect even has the
concept of a batch, I think, so that's a difficulty here too.
_Option 2: Reject Batches_
Close this PR permanently & open a new PR, that would basically be a
docs/readme update, to clarify what is the expected string users should provide
to parse_sql & similar API's. The clarification is it needs to be the contents
of a SQL batch, since that is the thing that can clearly be turned into a
single `Vec`. Dialects such as SQL Server which include batch
delimiters in the SQL source code should be processed _outside_ of this library
to split the batches, then the batch string can be given to sqlparser-rs.
The downside is back to that original objective, which is reading a whole
SQL file to a string then passing it to this library for parsing. If library
users have to do our own detection of the batch delimiter keyword, now we all
have to implement a mini parser before using the actual parser. That seems
really unfortunate.
_Option 3: Hold Batches out at arm's length_ (thesis, antithesis,
...synthesis)
I don't have a clear vision on this option yet, but it seems like this
library is well positioned to do _some_ work on this topic, but perhaps not as
much as full on Batch support. What if we had something like "parse_batches",
but it returned a simple `Vec`? I'm imagining something like:
```rust
fn split_batches -> Vec {
let mut result = vec![];
while let Some(token) = self.tokens.next() {
if peek_keyword(Keyword::GO) {
self.prev_token();
result.push(parse_go());
}
}
}
```
The concept here is the library has the ability to scan through the tokens &
parse the GO keyword, and split a single string into Vec but would
otherwise remain unchanged. Dialects that support batches could use this API to
turn a whole file string into something suitable for parse_statements. Other
dialects would remain unaffected.
---
I'd be interested to hear anyone else's opinion on this too, for/against any
option or to brainstorm other options.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
[PR] Add support for `GO` batch delimiter in SQL Server [datafusion-sqlparser-rs]
aharpervc opened a new pull request, #1809: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1809 Reference: https://learn.microsoft.com/en-us/sql/t-sql/language-elements/sql-server-utilities-statements-go Lots of conventional SQL Server tooling supports `GO`, so it seems really useful & prudent to be able to parse it. I realize that strictly speaking, it isn't a SQL "statement". However, I think it makes sense to model here as a statement since it acts as one in all regards except documentation. In terms of parsing, `GO` has some peculiarities I tried to express with the new tests, such as with regards to position and basically treating newline as a delimiter. This also starts to nudge on https://github.com/apache/datafusion-sqlparser-rs/issues/1800, where newlines may possibly be interpreted as statement delimiters in a general case. However, my approach here for `GO` is isolated to only that statement, at least for the time being. --- Also, this will conflict with https://github.com/apache/datafusion-sqlparser-rs/pull/1808 when either is merged, because I'm adding things that end up being adjacent to each other which can confuse git, depending on merge strategy. There's no _actual_ conflict. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] Add support for `GO` batch delimiter in SQL Server [datafusion-sqlparser-rs]
iffyio commented on PR #1809: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1809#issuecomment-2869271634 > are you closing the PR because you're opposed to supporting batch delimiters at all in this library, or is it that you don't want to accept this particular approach @aharpervc ah to clarify I'm not opposed to supporting the feature itself - my thinking is mainly that the feature is out of scope for the parser since it is not a sql construct, so that supporting it would be at best a nice to have, then had it been easy to include that support we could do so. But afaict so far in this PR, that unfortunately doesn't seem to be the case because there's been a few special case code being introduced to the parser. Basically that the complexity being by the PR from my pov isn't worth the return given that this feature isn't meant to be supported by the parser in the first place. Maybe if the code is much simpler and touches less areas (most importantly I think the core parser loop I highlighted in the previous comment), we could merge such a patch, but I'm unable to say for sure what that looks like, code wise, from my end. But actually I think if we can come up with something that doesn't touch that core parser loop we might have something mergeable. I'll reopen this PR, feel free to iterate if you have ideas to try out and I'm happy to take a look when you do! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] Add support for `GO` batch delimiter in SQL Server [datafusion-sqlparser-rs]
aharpervc commented on PR #1809: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1809#issuecomment-2868211237 > Closing this PR per my comment in [#1809 (comment)](https://github.com/apache/datafusion-sqlparser-rs/pull/1809#discussion_r2072719447) Thanks again @aharpervc for working on this and sorry for the confusion 🙏 I saw your comment but hadn't had a chance to get back to you. I'd like to get a clarification -- are you closing the PR because you're opposed to supporting batch delimiters _at all_ in this library, or is it that you don't want to accept this particular approach? If it's the latter, I'm completely open to continuing to iterate the branch. However if you don't want batch delimiters at all, that's disappointing -- it means this library will never be able to process real world SQL Server code, since tools on that stack all know how to process batch delimiters. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] Add support for `GO` batch delimiter in SQL Server [datafusion-sqlparser-rs]
iffyio commented on PR #1809: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1809#issuecomment-2868158991 Closing this PR per my comment in https://github.com/apache/datafusion-sqlparser-rs/pull/1809#discussion_r2072719447 Thanks again @aharpervc for working on this and sorry for the confusion :pray: -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] Add support for `GO` batch delimiter in SQL Server [datafusion-sqlparser-rs]
iffyio closed pull request #1809: Add support for `GO` batch delimiter in SQL Server URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1809 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] Add support for `GO` batch delimiter in SQL Server [datafusion-sqlparser-rs]
iffyio commented on code in PR #1809:
URL:
https://github.com/apache/datafusion-sqlparser-rs/pull/1809#discussion_r2072719447
##
src/parser/mod.rs:
##
@@ -475,6 +475,10 @@ impl<'a> Parser<'a> {
if expecting_statement_delimiter && word.keyword ==
Keyword::END {
break;
}
+
Review Comment:
@aharpervc Sorry for the delay getting back to this, actually I'm leaning
towards not merging this feature after all unfortunately. It feels to me like
there's a bit too many special cases (like this extending the core parsing loop
for example) for this feature that go against the normal of sql statement
parsing. And combined with this feature not being part of any sql reference but
rather a specific tool, it makes the tradeoff undesirable
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] Add support for `GO` batch delimiter in SQL Server [datafusion-sqlparser-rs]
aharpervc commented on PR #1809: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1809#issuecomment-2840347205 @iffyio anything else on this one? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] Add support for `GO` batch delimiter in SQL Server [datafusion-sqlparser-rs]
aharpervc commented on code in PR #1809:
URL:
https://github.com/apache/datafusion-sqlparser-rs/pull/1809#discussion_r2056934141
##
src/parser/mod.rs:
##
@@ -484,8 +488,18 @@ impl<'a> Parser<'a> {
}
let statement = self.parse_statement()?;
+expecting_statement_delimiter = match &statement {
+Statement::If(s) => match &s.if_block.conditional_statements {
+// the `END` keyword doesn't need to be followed by a
statement delimiter, so it shouldn't be expected here
+ConditionalStatements::BeginEnd { .. } => false,
+// parsing the statement sequence consumes the statement
delimiter, so it shouldn't be expected here
+ConditionalStatements::Sequence { .. } => false,
+},
+// Treat batch delimiter as an end of statement, so no
additional statement delimiter expected here
Review Comment:
I've simplified this to reduce it down to only concerns within this PR 👍
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] Add support for `GO` batch delimiter in SQL Server [datafusion-sqlparser-rs]
aharpervc commented on code in PR #1809:
URL:
https://github.com/apache/datafusion-sqlparser-rs/pull/1809#discussion_r2056929658
##
src/parser/mod.rs:
##
@@ -484,8 +488,18 @@ impl<'a> Parser<'a> {
}
let statement = self.parse_statement()?;
+expecting_statement_delimiter = match &statement {
+Statement::If(s) => match &s.if_block.conditional_statements {
+// the `END` keyword doesn't need to be followed by a
statement delimiter, so it shouldn't be expected here
+ConditionalStatements::BeginEnd { .. } => false,
+// parsing the statement sequence consumes the statement
delimiter, so it shouldn't be expected here
+ConditionalStatements::Sequence { .. } => false,
+},
+// Treat batch delimiter as an end of statement, so no
additional statement delimiter expected here
Review Comment:
I suspect so... need to investigate. (I need to rebuild my mental model on
this first 😆).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] Add support for `GO` batch delimiter in SQL Server [datafusion-sqlparser-rs]
aharpervc commented on code in PR #1809:
URL:
https://github.com/apache/datafusion-sqlparser-rs/pull/1809#discussion_r2056927589
##
src/parser/mod.rs:
##
@@ -618,6 +632,7 @@ impl<'a> Parser<'a> {
// `COMMENT` is snowflake specific
https://docs.snowflake.com/en/sql-reference/sql/comment
Keyword::COMMENT if self.dialect.supports_comment_on() =>
self.parse_comment(),
Keyword::PRINT => self.parse_print(),
+Keyword::GO => self.parse_go(),
Review Comment:
Thanks, that makes sense. I've updated it to rewind before parsing (using
raiserror as a reference)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] Add support for `GO` batch delimiter in SQL Server [datafusion-sqlparser-rs]
aharpervc commented on code in PR #1809:
URL:
https://github.com/apache/datafusion-sqlparser-rs/pull/1809#discussion_r2056927872
##
src/parser/mod.rs:
##
@@ -3939,6 +3954,26 @@ impl<'a> Parser<'a> {
})
}
+/// Return nth previous token, possibly whitespace
+/// (or [`Token::EOF`] when before the beginning of the stream).
+pub fn peek_prev_nth_token_no_skip(&self, n: usize) -> TokenWithSpan {
Review Comment:
Done 👍
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] Add support for `GO` batch delimiter in SQL Server [datafusion-sqlparser-rs]
aharpervc commented on code in PR #1809:
URL:
https://github.com/apache/datafusion-sqlparser-rs/pull/1809#discussion_r2056908951
##
src/dialect/mssql.rs:
##
@@ -116,7 +116,17 @@ impl Dialect for MsSqlDialect {
true
}
-fn is_column_alias(&self, kw: &Keyword, _parser: &mut Parser) -> bool {
+fn is_column_alias(&self, kw: &Keyword, parser: &mut Parser) -> bool {
+// if we find maybe whitespace then a newline looking backward, then
`GO` ISN'T a column alias
+// if we can't find a newline then we assume that `GO` IS a column
alias
+if kw == &Keyword::GO
+&& parser
+.expect_previously_only_whitespace_until_newline()
+.is_ok()
Review Comment:
Done 👍
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] Add support for `GO` batch delimiter in SQL Server [datafusion-sqlparser-rs]
aharpervc commented on code in PR #1809:
URL:
https://github.com/apache/datafusion-sqlparser-rs/pull/1809#discussion_r2056886926
##
src/dialect/mssql.rs:
##
@@ -116,7 +116,17 @@ impl Dialect for MsSqlDialect {
true
}
-fn is_column_alias(&self, kw: &Keyword, _parser: &mut Parser) -> bool {
+fn is_column_alias(&self, kw: &Keyword, parser: &mut Parser) -> bool {
+// if we find maybe whitespace then a newline looking backward, then
`GO` ISN'T a column alias
+// if we can't find a newline then we assume that `GO` IS a column
alias
+if kw == &Keyword::GO
+&& parser
+.expect_previously_only_whitespace_until_newline()
+.is_ok()
Review Comment:
Gotcha, I will update it to do that
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] Add support for `GO` batch delimiter in SQL Server [datafusion-sqlparser-rs]
aharpervc commented on code in PR #1809:
URL:
https://github.com/apache/datafusion-sqlparser-rs/pull/1809#discussion_r2056903407
##
src/parser/mod.rs:
##
@@ -4055,6 +4090,38 @@ impl<'a> Parser<'a> {
)
}
+/// Look backwards in the token stream and expect that there was only
whitespace tokens until the previous newline or beginning of string
+pub(crate) fn expect_previously_only_whitespace_until_newline(
+&mut self,
+) -> Result<(), ParserError> {
+let mut look_back_count = 1;
+loop {
+let prev_token = self.peek_prev_nth_token_no_skip(look_back_count);
+match prev_token.token {
+Token::EOF => break,
+Token::Whitespace(ref w) => match w {
+Whitespace::Newline => break,
+// special consideration required for single line comments
since that string includes the newline
+Whitespace::SingleLineComment { comment, prefix: _ } => {
+if comment.ends_with('\n') {
Review Comment:
> double checking: is there a scenario where a single line comment doesn't
end with a new line? spontaneously sounds like that should always hold true so
that the manual newline check would not be required
I actually don't know, I was surprised to find that the newline is actually
part of the comment text in this library. It seemed prudent to be defensive in
this new code, since that comment parsing behavior maybe isn't particularly
intentional.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] Add support for `GO` batch delimiter in SQL Server [datafusion-sqlparser-rs]
aharpervc commented on PR #1809: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1809#issuecomment-2825528849 Here's another example case this PR should parse properly, before merging (on my todo list...) ``` USE some_database; GO ;WITH cte AS ( SELECT 1 x ) SELECT * FROM cte; ``` The peek logic in parse_go should be seeing that newline, but it's accidentally advancing past the whitespace, seeing the semi-colon, and failing. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] Add support for `GO` batch delimiter in SQL Server [datafusion-sqlparser-rs]
iffyio commented on code in PR #1809:
URL:
https://github.com/apache/datafusion-sqlparser-rs/pull/1809#discussion_r2056434400
##
src/parser/mod.rs:
##
@@ -484,8 +488,18 @@ impl<'a> Parser<'a> {
}
let statement = self.parse_statement()?;
+expecting_statement_delimiter = match &statement {
+Statement::If(s) => match &s.if_block.conditional_statements {
+// the `END` keyword doesn't need to be followed by a
statement delimiter, so it shouldn't be expected here
+ConditionalStatements::BeginEnd { .. } => false,
+// parsing the statement sequence consumes the statement
delimiter, so it shouldn't be expected here
+ConditionalStatements::Sequence { .. } => false,
+},
+// Treat batch delimiter as an end of statement, so no
additional statement delimiter expected here
Review Comment:
will this be replaced by similar solution as in #1808?
##
src/dialect/mssql.rs:
##
@@ -116,7 +116,17 @@ impl Dialect for MsSqlDialect {
true
}
-fn is_column_alias(&self, kw: &Keyword, _parser: &mut Parser) -> bool {
+fn is_column_alias(&self, kw: &Keyword, parser: &mut Parser) -> bool {
+// if we find maybe whitespace then a newline looking backward, then
`GO` ISN'T a column alias
+// if we can't find a newline then we assume that `GO` IS a column
alias
+if kw == &Keyword::GO
+&& parser
+.expect_previously_only_whitespace_until_newline()
+.is_ok()
Review Comment:
Ah I see, 2. sounds desirable to return a boolean, in that the function is
only checking if a condition holds, then interpretation of the condition is
left to the caller of the function. in the other usage the caller does e.g.
```rust
if !self.prev_whitespace_only_is_newline() {
return self.expected("newline ...")
}
```
##
src/parser/mod.rs:
##
@@ -3939,6 +3954,26 @@ impl<'a> Parser<'a> {
})
}
+/// Return nth previous token, possibly whitespace
+/// (or [`Token::EOF`] when before the beginning of the stream).
+pub fn peek_prev_nth_token_no_skip(&self, n: usize) -> TokenWithSpan {
Review Comment:
```suggestion
pub(crate) fn peek_prev_nth_token_no_skip(&self, n: usize) ->
&TokenWithSpan {
```
we can probably start with the API being non-public. Then can we switch to
returning a reference to avoid the mandatory cloning?
##
src/parser/mod.rs:
##
@@ -618,6 +632,7 @@ impl<'a> Parser<'a> {
// `COMMENT` is snowflake specific
https://docs.snowflake.com/en/sql-reference/sql/comment
Keyword::COMMENT if self.dialect.supports_comment_on() =>
self.parse_comment(),
Keyword::PRINT => self.parse_print(),
+Keyword::GO => self.parse_go(),
Review Comment:
Indeed the code is inconsistent in that regard historically, for newer
statements ideally the parser functions are self contained when possible
##
src/parser/mod.rs:
##
@@ -4055,6 +4090,38 @@ impl<'a> Parser<'a> {
)
}
+/// Look backwards in the token stream and expect that there was only
whitespace tokens until the previous newline or beginning of string
+pub(crate) fn expect_previously_only_whitespace_until_newline(
+&mut self,
+) -> Result<(), ParserError> {
+let mut look_back_count = 1;
+loop {
+let prev_token = self.peek_prev_nth_token_no_skip(look_back_count);
+match prev_token.token {
+Token::EOF => break,
+Token::Whitespace(ref w) => match w {
+Whitespace::Newline => break,
+// special consideration required for single line comments
since that string includes the newline
+Whitespace::SingleLineComment { comment, prefix: _ } => {
+if comment.ends_with('\n') {
Review Comment:
double checking: is there a scenario where a single line comment doesn't end
with a new line? spontaneously sounds like that should always hold true so that
the manual newline check would not be required
##
src/parser/mod.rs:
##
@@ -15064,6 +15117,38 @@ impl<'a> Parser<'a> {
}))
}
+/// Parse [Statement::Go]
+fn parse_go(&mut self) -> Result {
+self.expect_previously_only_whitespace_until_newline()?;
Review Comment:
Ah that makes sense! And the added comment clarifies as well thanks!
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
--
Re: [PR] Add support for `GO` batch delimiter in SQL Server [datafusion-sqlparser-rs]
aharpervc commented on code in PR #1809:
URL:
https://github.com/apache/datafusion-sqlparser-rs/pull/1809#discussion_r2052788476
##
src/test_utils.rs:
##
@@ -166,6 +168,32 @@ impl TestedDialects {
only_statement
}
+/// The same as [`one_statement_parses_to`] but it works for a multiple
statements
+pub fn multiple_statements_parse_to(
Review Comment:
Done 👍
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] Add support for `GO` batch delimiter in SQL Server [datafusion-sqlparser-rs]
aharpervc commented on code in PR #1809:
URL:
https://github.com/apache/datafusion-sqlparser-rs/pull/1809#discussion_r2052788247
##
src/parser/mod.rs:
##
@@ -15064,6 +15117,38 @@ impl<'a> Parser<'a> {
}))
}
+/// Parse [Statement::Go]
+fn parse_go(&mut self) -> Result {
+self.expect_previously_only_whitespace_until_newline()?;
+
+let count = loop {
+// using this peek function because we want to halt this statement
parsing upon newline
Review Comment:
Done 👍
##
src/parser/mod.rs:
##
@@ -4055,6 +4070,44 @@ impl<'a> Parser<'a> {
)
}
+/// Look backwards in the token stream and expect that there was only
whitespace tokens until the previous newline
+pub fn expect_previously_only_whitespace_until_newline(&mut self) ->
Result<(), ParserError> {
Review Comment:
Done 👍
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] Add support for `GO` batch delimiter in SQL Server [datafusion-sqlparser-rs]
aharpervc commented on code in PR #1809:
URL:
https://github.com/apache/datafusion-sqlparser-rs/pull/1809#discussion_r2052785862
##
src/parser/mod.rs:
##
@@ -4055,6 +4070,44 @@ impl<'a> Parser<'a> {
)
}
+/// Look backwards in the token stream and expect that there was only
whitespace tokens until the previous newline
+pub fn expect_previously_only_whitespace_until_newline(&mut self) ->
Result<(), ParserError> {
Review Comment:
Done 👍
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] Add support for `GO` batch delimiter in SQL Server [datafusion-sqlparser-rs]
aharpervc commented on code in PR #1809:
URL:
https://github.com/apache/datafusion-sqlparser-rs/pull/1809#discussion_r2052462208
##
src/parser/mod.rs:
##
@@ -15064,6 +15117,38 @@ impl<'a> Parser<'a> {
}))
}
+/// Parse [Statement::Go]
+fn parse_go(&mut self) -> Result {
+self.expect_previously_only_whitespace_until_newline()?;
Review Comment:
> hmm this call look equivalent to the loop below that peeks forward to skip
newlines? unclear why we're peeking backwards here right after we parsed the GO
keyword (so that this will always be infallible?)
The too loops aren't the same. The reason we need to peek look backwards
here is to forbid this syntax:
```sql
-- `GO` is disallowed
select 1; go
```
If you actually run this on a real SQL Server instance, you will also get a
similar error: `Incorrect syntax near 'go'.`. Therefore it seems prudent to
forbid this syntax from parsing here in this library. This situation has test
coverage with the `invalid_go_position` example.
---
Note that the extremely similar syntax `select 1 go` _is_ allowed, but in
this example `go` is a column alias rather than a batch delimiter.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] Add support for `GO` batch delimiter in SQL Server [datafusion-sqlparser-rs]
aharpervc commented on code in PR #1809:
URL:
https://github.com/apache/datafusion-sqlparser-rs/pull/1809#discussion_r2052471264
##
src/dialect/mssql.rs:
##
@@ -116,7 +116,17 @@ impl Dialect for MsSqlDialect {
true
}
-fn is_column_alias(&self, kw: &Keyword, _parser: &mut Parser) -> bool {
+fn is_column_alias(&self, kw: &Keyword, parser: &mut Parser) -> bool {
+// if we find maybe whitespace then a newline looking backward, then
`GO` ISN'T a column alias
+// if we can't find a newline then we assume that `GO` IS a column
alias
+if kw == &Keyword::GO
+&& parser
+.expect_previously_only_whitespace_until_newline()
+.is_ok()
Review Comment:
In this case, it's most useful to know if there was previously only
whitespace until a newline. However in the other call of this function in
parse_go, we legitimately do want a parser error. Therefore we have choices:
1. expect_previously_only_whitespace_until_newline raises a parse error
which makes sense in parse_go but is converted to a boolean here
2. expect_previously_only_whitespace_until_newline returns a boolean which
makes sense here and then parse_go needs to separately raise a parse error
Which is the best choice for the time being?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] Add support for `GO` batch delimiter in SQL Server [datafusion-sqlparser-rs]
aharpervc commented on code in PR #1809:
URL:
https://github.com/apache/datafusion-sqlparser-rs/pull/1809#discussion_r2052462208
##
src/parser/mod.rs:
##
@@ -15064,6 +15117,38 @@ impl<'a> Parser<'a> {
}))
}
+/// Parse [Statement::Go]
+fn parse_go(&mut self) -> Result {
+self.expect_previously_only_whitespace_until_newline()?;
Review Comment:
> hmm this call look equivalent to the loop below that peeks forward to skip
newlines? unclear why we're peeking backwards here right after we parsed the GO
keyword (so that this will always be infallible?)
The too loops aren't the same. The reason we need to peek look backwards
here is to forbid this syntax:
```sql
-- `GO` is disallowed
select 1; go
```
If you actually run this on a real SQL Server instance, you will also get a
similar error: `Incorrect syntax near 'go'.`. Therefore it seems prudent to
forbid this syntax from parsing here in this library.
---
Note that the extremely similar syntax `select 1 go` _is_ allowed, but in
this example `go` is a column alias rather than a batch delimiter.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] Add support for `GO` batch delimiter in SQL Server [datafusion-sqlparser-rs]
aharpervc commented on code in PR #1809:
URL:
https://github.com/apache/datafusion-sqlparser-rs/pull/1809#discussion_r2052462208
##
src/parser/mod.rs:
##
@@ -15064,6 +15117,38 @@ impl<'a> Parser<'a> {
}))
}
+/// Parse [Statement::Go]
+fn parse_go(&mut self) -> Result {
+self.expect_previously_only_whitespace_until_newline()?;
Review Comment:
> hmm this call look equivalent to the loop below that peeks forward to skip
newlines? unclear why we're peeking backwards here right after we parsed the GO
keyword (so that this will always be infallible?)
The too loops aren't the same. The reason we need to peek look backwards
here is to forbid this syntax:
```sql
-- `GO` is disallowed
select 1; go
```
If you actually run this on a real SQL Server instance, you will also get a
similar error: `Incorrect syntax near 'go'.`. Therefore it seems prudent to
forbid this syntax from parsing here in this library.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] Add support for `GO` batch delimiter in SQL Server [datafusion-sqlparser-rs]
aharpervc commented on code in PR #1809:
URL:
https://github.com/apache/datafusion-sqlparser-rs/pull/1809#discussion_r2052450841
##
src/parser/mod.rs:
##
@@ -15064,6 +15117,38 @@ impl<'a> Parser<'a> {
}))
}
+/// Parse [Statement::Go]
+fn parse_go(&mut self) -> Result {
+self.expect_previously_only_whitespace_until_newline()?;
+
+let count = loop {
+// using this peek function because we want to halt this statement
parsing upon newline
Review Comment:
Yes, I can add that comment. However there is also test coverage for this
behavior, so it should be protected from future refactoring
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] Add support for `GO` batch delimiter in SQL Server [datafusion-sqlparser-rs]
aharpervc commented on code in PR #1809:
URL:
https://github.com/apache/datafusion-sqlparser-rs/pull/1809#discussion_r2052449901
##
src/parser/mod.rs:
##
@@ -618,6 +632,7 @@ impl<'a> Parser<'a> {
// `COMMENT` is snowflake specific
https://docs.snowflake.com/en/sql-reference/sql/comment
Keyword::COMMENT if self.dialect.supports_comment_on() =>
self.parse_comment(),
Keyword::PRINT => self.parse_print(),
+Keyword::GO => self.parse_go(),
Review Comment:
I see that some of the statement types in parse_statement are doing that but
most aren't. What's the strategy on this? How do you know when you're supposed
to do it or not?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] Add support for `GO` batch delimiter in SQL Server [datafusion-sqlparser-rs]
aharpervc commented on code in PR #1809:
URL:
https://github.com/apache/datafusion-sqlparser-rs/pull/1809#discussion_r2050966075
##
tests/sqlparser_mssql.rs:
##
@@ -2053,3 +2054,171 @@ fn parse_drop_trigger() {
}
);
}
+
+#[test]
+fn parse_mssql_go_keyword() {
+let single_go_keyword = "USE some_database;\nGO";
+let stmts = ms().parse_sql_statements(single_go_keyword).unwrap();
+assert_eq!(stmts.len(), 2);
+assert_eq!(stmts[1], Statement::Go(GoStatement { count: None }),);
+
+let go_with_count = "SELECT 1;\nGO 5";
+let stmts = ms().parse_sql_statements(go_with_count).unwrap();
+assert_eq!(stmts.len(), 2);
+assert_eq!(stmts[1], Statement::Go(GoStatement { count: Some(5) }));
+
+let go_statement_delimiter = "SELECT 1\nGO";
+let stmts = ms().parse_sql_statements(go_statement_delimiter).unwrap();
+assert_eq!(stmts.len(), 2);
+assert_eq!(stmts[1], Statement::Go(GoStatement { count: None }));
+
+let bare_go = "GO";
+let stmts = ms().parse_sql_statements(bare_go).unwrap();
+assert_eq!(stmts.len(), 1);
+assert_eq!(stmts[0], Statement::Go(GoStatement { count: None }));
+
+let go_then_statements = "/* whitespace */ GO\nRAISERROR('This is a test',
16, 1);";
+let stmts = ms().parse_sql_statements(go_then_statements).unwrap();
+assert_eq!(stmts.len(), 2);
+assert_eq!(stmts[0], Statement::Go(GoStatement { count: None }));
+assert_eq!(
+stmts[1],
+Statement::RaisError {
+message: Box::new(Expr::Value(
+(Value::SingleQuotedString("This is a
test".to_string())).with_empty_span()
+)),
+severity: Box::new(Expr::Value(number("16").with_empty_span())),
+state: Box::new(Expr::Value(number("1").with_empty_span())),
+arguments: vec![],
+options: vec![],
+}
+);
+
+let multiple_gos = "SELECT 1;\nGO 5\nSELECT 2;\n GO";
+let stmts = ms().parse_sql_statements(multiple_gos).unwrap();
+assert_eq!(stmts.len(), 4);
+assert_eq!(stmts[1], Statement::Go(GoStatement { count: Some(5) }));
+assert_eq!(stmts[3], Statement::Go(GoStatement { count: None }));
+
+let single_line_comment_preceding_go = "USE some_database; -- okay\nGO";
+let stmts = ms()
+.parse_sql_statements(single_line_comment_preceding_go)
+.unwrap();
+assert_eq!(stmts.len(), 2);
+assert_eq!(stmts[1], Statement::Go(GoStatement { count: None }));
+
+let multi_line_comment_preceding_go = "USE some_database; /* okay */\nGO";
+let stmts = ms()
+.parse_sql_statements(multi_line_comment_preceding_go)
+.unwrap();
+assert_eq!(stmts.len(), 2);
+assert_eq!(stmts[1], Statement::Go(GoStatement { count: None }));
+
+let comment_following_go = "USE some_database;\nGO -- okay";
Review Comment:
Done 👍
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] Add support for `GO` batch delimiter in SQL Server [datafusion-sqlparser-rs]
iffyio commented on code in PR #1809:
URL:
https://github.com/apache/datafusion-sqlparser-rs/pull/1809#discussion_r2051416091
##
src/parser/mod.rs:
##
@@ -4055,6 +4070,44 @@ impl<'a> Parser<'a> {
)
}
+/// Look backwards in the token stream and expect that there was only
whitespace tokens until the previous newline
+pub fn expect_previously_only_whitespace_until_newline(&mut self) ->
Result<(), ParserError> {
Review Comment:
```suggestion
pub(crate) fn expect_previously_only_whitespace_until_newline(&mut self)
-> Result<(), ParserError> {
```
##
src/parser/mod.rs:
##
@@ -15064,6 +15117,38 @@ impl<'a> Parser<'a> {
}))
}
+/// Parse [Statement::Go]
+fn parse_go(&mut self) -> Result {
+self.expect_previously_only_whitespace_until_newline()?;
Review Comment:
hmm this call look equivalent to the loop below that peeks forward to skip
newlines? unclear why we're peeking backwards here right after we parsed the GO
keyword (so that this will always be infallible?)
##
src/parser/mod.rs:
##
@@ -15064,6 +15117,38 @@ impl<'a> Parser<'a> {
}))
}
+/// Parse [Statement::Go]
+fn parse_go(&mut self) -> Result {
+self.expect_previously_only_whitespace_until_newline()?;
+
+let count = loop {
+// using this peek function because we want to halt this statement
parsing upon newline
Review Comment:
Can we include the example you had in the comment earlier, explicitly
highlighting why this statement is special? I think otherwise it would not be
obvious to folks that come across the code later on why the current code is a
special case
##
tests/sqlparser_mssql.rs:
##
@@ -2053,3 +2054,171 @@ fn parse_drop_trigger() {
}
);
}
+
+#[test]
+fn parse_mssql_go_keyword() {
+let single_go_keyword = "USE some_database;\nGO";
Review Comment:
Ah yeah a new helper sounds reasonable to me thanks!
##
src/parser/mod.rs:
##
@@ -618,6 +632,7 @@ impl<'a> Parser<'a> {
// `COMMENT` is snowflake specific
https://docs.snowflake.com/en/sql-reference/sql/comment
Keyword::COMMENT if self.dialect.supports_comment_on() =>
self.parse_comment(),
Keyword::PRINT => self.parse_print(),
+Keyword::GO => self.parse_go(),
Review Comment:
can we rewind with `self.prev_token()` here before calling parse_go? so that
the `parse_go` function is self contained and can parse a full GO statement. (I
also realise now we missed that for print)
##
src/dialect/mssql.rs:
##
@@ -116,7 +116,17 @@ impl Dialect for MsSqlDialect {
true
}
-fn is_column_alias(&self, kw: &Keyword, _parser: &mut Parser) -> bool {
+fn is_column_alias(&self, kw: &Keyword, parser: &mut Parser) -> bool {
+// if we find maybe whitespace then a newline looking backward, then
`GO` ISN'T a column alias
+// if we can't find a newline then we assume that `GO` IS a column
alias
+if kw == &Keyword::GO
+&& parser
+.expect_previously_only_whitespace_until_newline()
+.is_ok()
Review Comment:
if the intent is for the condition to return a bool, then we can update this
function to return a bool instead of a result (where the latter is flagging an
error to be handled)
##
src/test_utils.rs:
##
@@ -166,6 +168,32 @@ impl TestedDialects {
only_statement
}
+/// The same as [`one_statement_parses_to`] but it works for a multiple
statements
+pub fn multiple_statements_parse_to(
Review Comment:
```suggestion
pub fn statements_parse_to(
```
##
src/parser/mod.rs:
##
@@ -4055,6 +4070,44 @@ impl<'a> Parser<'a> {
)
}
+/// Look backwards in the token stream and expect that there was only
whitespace tokens until the previous newline
+pub fn expect_previously_only_whitespace_until_newline(&mut self) ->
Result<(), ParserError> {
Review Comment:
hmm so i think it would be good to write this function in a reusable manner
if possible since it feels similar to the `peek_*`, `prev_*` helper functions
that we have. Would it be possible to introduce a generic
`prev_nth_token_no_skip(n: usize)` that can be called by parser and dialect
methods?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] Add support for `GO` batch delimiter in SQL Server [datafusion-sqlparser-rs]
iffyio commented on code in PR #1809:
URL:
https://github.com/apache/datafusion-sqlparser-rs/pull/1809#discussion_r2051415729
##
src/ast/mod.rs:
##
@@ -4054,6 +4054,12 @@ pub enum Statement {
arguments: Vec,
options: Vec,
},
+/// Go (MSSQL)
+///
+/// GO is not a Transact-SQL statement; it is a command recognized by
various tools as a batch delimiter
+///
+/// See
Review Comment:
Ah indeed, ideally both would have the doc comments on their respective
structs, I think I meant to reference return statement in the other PR as the
example
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] Add support for `GO` batch delimiter in SQL Server [datafusion-sqlparser-rs]
aharpervc commented on code in PR #1809:
URL:
https://github.com/apache/datafusion-sqlparser-rs/pull/1809#discussion_r2050966395
##
src/dialect/mssql.rs:
##
@@ -116,7 +116,29 @@ impl Dialect for MsSqlDialect {
true
}
-fn is_column_alias(&self, kw: &Keyword, _parser: &mut Parser) -> bool {
+fn is_column_alias(&self, kw: &Keyword, parser: &mut Parser) -> bool {
+// if:
+// - keyword is `GO`, and
+// - looking backwards there's only (any) whitespace preceded by a
newline
+// then: `GO` iSN'T a column alias
+if kw == &Keyword::GO {
+let mut look_back_count = 2;
+loop {
Review Comment:
I added a new helper, done 👍
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] Add support for `GO` batch delimiter in SQL Server [datafusion-sqlparser-rs]
aharpervc commented on code in PR #1809:
URL:
https://github.com/apache/datafusion-sqlparser-rs/pull/1809#discussion_r2050966038
##
tests/sqlparser_mssql.rs:
##
@@ -2053,3 +2054,171 @@ fn parse_drop_trigger() {
}
);
}
+
+#[test]
+fn parse_mssql_go_keyword() {
+let single_go_keyword = "USE some_database;\nGO";
Review Comment:
I'm unsure the best way to approach this question. `one_statement_parses_to`
requires _one_ statement, right? But most of these SQL text examples are
multiple statements. I didn't see anything in TestUtils for a series of
statements, so I added a new helper accordingly with canonical strings, let me
know if that covers it
##
tests/sqlparser_mssql.rs:
##
@@ -2053,3 +2054,171 @@ fn parse_drop_trigger() {
}
);
}
+
+#[test]
+fn parse_mssql_go_keyword() {
+let single_go_keyword = "USE some_database;\nGO";
+let stmts = ms().parse_sql_statements(single_go_keyword).unwrap();
+assert_eq!(stmts.len(), 2);
+assert_eq!(stmts[1], Statement::Go(GoStatement { count: None }),);
+
+let go_with_count = "SELECT 1;\nGO 5";
+let stmts = ms().parse_sql_statements(go_with_count).unwrap();
+assert_eq!(stmts.len(), 2);
+assert_eq!(stmts[1], Statement::Go(GoStatement { count: Some(5) }));
+
+let go_statement_delimiter = "SELECT 1\nGO";
+let stmts = ms().parse_sql_statements(go_statement_delimiter).unwrap();
+assert_eq!(stmts.len(), 2);
+assert_eq!(stmts[1], Statement::Go(GoStatement { count: None }));
+
+let bare_go = "GO";
+let stmts = ms().parse_sql_statements(bare_go).unwrap();
+assert_eq!(stmts.len(), 1);
+assert_eq!(stmts[0], Statement::Go(GoStatement { count: None }));
+
+let go_then_statements = "/* whitespace */ GO\nRAISERROR('This is a test',
16, 1);";
+let stmts = ms().parse_sql_statements(go_then_statements).unwrap();
+assert_eq!(stmts.len(), 2);
+assert_eq!(stmts[0], Statement::Go(GoStatement { count: None }));
+assert_eq!(
+stmts[1],
+Statement::RaisError {
+message: Box::new(Expr::Value(
+(Value::SingleQuotedString("This is a
test".to_string())).with_empty_span()
+)),
+severity: Box::new(Expr::Value(number("16").with_empty_span())),
+state: Box::new(Expr::Value(number("1").with_empty_span())),
+arguments: vec![],
+options: vec![],
+}
+);
+
+let multiple_gos = "SELECT 1;\nGO 5\nSELECT 2;\n GO";
+let stmts = ms().parse_sql_statements(multiple_gos).unwrap();
+assert_eq!(stmts.len(), 4);
+assert_eq!(stmts[1], Statement::Go(GoStatement { count: Some(5) }));
+assert_eq!(stmts[3], Statement::Go(GoStatement { count: None }));
+
+let single_line_comment_preceding_go = "USE some_database; -- okay\nGO";
+let stmts = ms()
+.parse_sql_statements(single_line_comment_preceding_go)
+.unwrap();
+assert_eq!(stmts.len(), 2);
+assert_eq!(stmts[1], Statement::Go(GoStatement { count: None }));
+
+let multi_line_comment_preceding_go = "USE some_database; /* okay */\nGO";
+let stmts = ms()
+.parse_sql_statements(multi_line_comment_preceding_go)
+.unwrap();
+assert_eq!(stmts.len(), 2);
+assert_eq!(stmts[1], Statement::Go(GoStatement { count: None }));
+
+let comment_following_go = "USE some_database;\nGO -- okay";
Review Comment:
Done 👍
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] Add support for `GO` batch delimiter in SQL Server [datafusion-sqlparser-rs]
aharpervc commented on code in PR #1809:
URL:
https://github.com/apache/datafusion-sqlparser-rs/pull/1809#discussion_r2050865353
##
tests/sqlparser_mssql.rs:
##
@@ -2053,3 +2054,171 @@ fn parse_drop_trigger() {
}
);
}
+
+#[test]
+fn parse_mssql_go_keyword() {
+let single_go_keyword = "USE some_database;\nGO";
+let stmts = ms().parse_sql_statements(single_go_keyword).unwrap();
+assert_eq!(stmts.len(), 2);
+assert_eq!(stmts[1], Statement::Go(GoStatement { count: None }),);
+
+let go_with_count = "SELECT 1;\nGO 5";
+let stmts = ms().parse_sql_statements(go_with_count).unwrap();
+assert_eq!(stmts.len(), 2);
+assert_eq!(stmts[1], Statement::Go(GoStatement { count: Some(5) }));
+
+let go_statement_delimiter = "SELECT 1\nGO";
+let stmts = ms().parse_sql_statements(go_statement_delimiter).unwrap();
+assert_eq!(stmts.len(), 2);
+assert_eq!(stmts[1], Statement::Go(GoStatement { count: None }));
+
+let bare_go = "GO";
+let stmts = ms().parse_sql_statements(bare_go).unwrap();
+assert_eq!(stmts.len(), 1);
+assert_eq!(stmts[0], Statement::Go(GoStatement { count: None }));
+
+let go_then_statements = "/* whitespace */ GO\nRAISERROR('This is a test',
16, 1);";
+let stmts = ms().parse_sql_statements(go_then_statements).unwrap();
+assert_eq!(stmts.len(), 2);
+assert_eq!(stmts[0], Statement::Go(GoStatement { count: None }));
+assert_eq!(
+stmts[1],
+Statement::RaisError {
+message: Box::new(Expr::Value(
+(Value::SingleQuotedString("This is a
test".to_string())).with_empty_span()
+)),
+severity: Box::new(Expr::Value(number("16").with_empty_span())),
+state: Box::new(Expr::Value(number("1").with_empty_span())),
+arguments: vec![],
+options: vec![],
+}
+);
+
+let multiple_gos = "SELECT 1;\nGO 5\nSELECT 2;\n GO";
+let stmts = ms().parse_sql_statements(multiple_gos).unwrap();
+assert_eq!(stmts.len(), 4);
+assert_eq!(stmts[1], Statement::Go(GoStatement { count: Some(5) }));
+assert_eq!(stmts[3], Statement::Go(GoStatement { count: None }));
+
+let single_line_comment_preceding_go = "USE some_database; -- okay\nGO";
+let stmts = ms()
+.parse_sql_statements(single_line_comment_preceding_go)
+.unwrap();
+assert_eq!(stmts.len(), 2);
+assert_eq!(stmts[1], Statement::Go(GoStatement { count: None }));
+
+let multi_line_comment_preceding_go = "USE some_database; /* okay */\nGO";
+let stmts = ms()
+.parse_sql_statements(multi_line_comment_preceding_go)
+.unwrap();
+assert_eq!(stmts.len(), 2);
+assert_eq!(stmts[1], Statement::Go(GoStatement { count: None }));
+
+let comment_following_go = "USE some_database;\nGO -- okay";
Review Comment:
Done 👍
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] Add support for `GO` batch delimiter in SQL Server [datafusion-sqlparser-rs]
aharpervc commented on code in PR #1809:
URL:
https://github.com/apache/datafusion-sqlparser-rs/pull/1809#discussion_r2050847853
##
src/parser/mod.rs:
##
@@ -15017,6 +15026,48 @@ impl<'a> Parser<'a> {
}
}
+fn parse_go(&mut self) -> Result {
+// previous token should be a newline (skipping non-newline whitespace)
Review Comment:
Yes, I have something like that already for parsing the batch count, which
doesn't need to look backwards and already uses peek_token_no_skip. I think
between here & your other comment on the alias function, the concern is
regarding the newline logic which looks backwards. I'll try and move that into
a helper so it can be simplified.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] Add support for `GO` batch delimiter in SQL Server [datafusion-sqlparser-rs]
aharpervc commented on code in PR #1809:
URL:
https://github.com/apache/datafusion-sqlparser-rs/pull/1809#discussion_r2050843298
##
src/ast/mod.rs:
##
@@ -4054,6 +4054,12 @@ pub enum Statement {
arguments: Vec,
options: Vec,
},
+/// Go (MSSQL)
+///
+/// GO is not a Transact-SQL statement; it is a command recognized by
various tools as a batch delimiter
+///
+/// See
Review Comment:
This is already the same as print 🤔. Are you saying both comments should
move to their respective structs?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] Add support for `GO` batch delimiter in SQL Server [datafusion-sqlparser-rs]
aharpervc commented on code in PR #1809:
URL:
https://github.com/apache/datafusion-sqlparser-rs/pull/1809#discussion_r2050836373
##
src/dialect/mssql.rs:
##
@@ -116,7 +116,29 @@ impl Dialect for MsSqlDialect {
true
}
-fn is_column_alias(&self, kw: &Keyword, _parser: &mut Parser) -> bool {
+fn is_column_alias(&self, kw: &Keyword, parser: &mut Parser) -> bool {
+// if:
+// - keyword is `GO`, and
+// - looking backwards there's only (any) whitespace preceded by a
newline
+// then: `GO` iSN'T a column alias
+if kw == &Keyword::GO {
+let mut look_back_count = 2;
Review Comment:
It's similar to parse_go, which references the existing previous token logic
which also starts by subtracting 2. My understanding is that 0 is the next
token, -1 is the current token, and -2 is the first previous token.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] Add support for `GO` batch delimiter in SQL Server [datafusion-sqlparser-rs]
iffyio commented on code in PR #1809:
URL:
https://github.com/apache/datafusion-sqlparser-rs/pull/1809#discussion_r2050229485
##
tests/sqlparser_mssql.rs:
##
@@ -2053,3 +2054,171 @@ fn parse_drop_trigger() {
}
);
}
+
+#[test]
+fn parse_mssql_go_keyword() {
+let single_go_keyword = "USE some_database;\nGO";
Review Comment:
Can we have one of the test cases use `ms().one_statement_parses_to(sql,
canonical)`? looking at the tests its unclear what the behavior of the parser
is in terms of serializing an AST that has GO statements in it, and that it
needs special behavior to inject new lines manually I imagine
##
tests/sqlparser_mssql.rs:
##
@@ -2053,3 +2054,171 @@ fn parse_drop_trigger() {
}
);
}
+
+#[test]
+fn parse_mssql_go_keyword() {
+let single_go_keyword = "USE some_database;\nGO";
+let stmts = ms().parse_sql_statements(single_go_keyword).unwrap();
+assert_eq!(stmts.len(), 2);
+assert_eq!(stmts[1], Statement::Go(GoStatement { count: None }),);
+
+let go_with_count = "SELECT 1;\nGO 5";
+let stmts = ms().parse_sql_statements(go_with_count).unwrap();
+assert_eq!(stmts.len(), 2);
+assert_eq!(stmts[1], Statement::Go(GoStatement { count: Some(5) }));
+
+let go_statement_delimiter = "SELECT 1\nGO";
+let stmts = ms().parse_sql_statements(go_statement_delimiter).unwrap();
+assert_eq!(stmts.len(), 2);
+assert_eq!(stmts[1], Statement::Go(GoStatement { count: None }));
+
+let bare_go = "GO";
+let stmts = ms().parse_sql_statements(bare_go).unwrap();
+assert_eq!(stmts.len(), 1);
+assert_eq!(stmts[0], Statement::Go(GoStatement { count: None }));
+
+let go_then_statements = "/* whitespace */ GO\nRAISERROR('This is a test',
16, 1);";
+let stmts = ms().parse_sql_statements(go_then_statements).unwrap();
+assert_eq!(stmts.len(), 2);
+assert_eq!(stmts[0], Statement::Go(GoStatement { count: None }));
+assert_eq!(
+stmts[1],
+Statement::RaisError {
+message: Box::new(Expr::Value(
+(Value::SingleQuotedString("This is a
test".to_string())).with_empty_span()
+)),
+severity: Box::new(Expr::Value(number("16").with_empty_span())),
+state: Box::new(Expr::Value(number("1").with_empty_span())),
+arguments: vec![],
+options: vec![],
+}
+);
+
+let multiple_gos = "SELECT 1;\nGO 5\nSELECT 2;\n GO";
+let stmts = ms().parse_sql_statements(multiple_gos).unwrap();
+assert_eq!(stmts.len(), 4);
+assert_eq!(stmts[1], Statement::Go(GoStatement { count: Some(5) }));
+assert_eq!(stmts[3], Statement::Go(GoStatement { count: None }));
+
+let single_line_comment_preceding_go = "USE some_database; -- okay\nGO";
+let stmts = ms()
+.parse_sql_statements(single_line_comment_preceding_go)
+.unwrap();
+assert_eq!(stmts.len(), 2);
+assert_eq!(stmts[1], Statement::Go(GoStatement { count: None }));
+
+let multi_line_comment_preceding_go = "USE some_database; /* okay */\nGO";
+let stmts = ms()
+.parse_sql_statements(multi_line_comment_preceding_go)
+.unwrap();
+assert_eq!(stmts.len(), 2);
+assert_eq!(stmts[1], Statement::Go(GoStatement { count: None }));
+
+let comment_following_go = "USE some_database;\nGO -- okay";
Review Comment:
maybe we can add a `multine_line_comment_following_go` test case? e.g. `GO
/* whitespace */ 42`
##
src/dialect/mssql.rs:
##
@@ -116,7 +116,29 @@ impl Dialect for MsSqlDialect {
true
}
-fn is_column_alias(&self, kw: &Keyword, _parser: &mut Parser) -> bool {
+fn is_column_alias(&self, kw: &Keyword, parser: &mut Parser) -> bool {
+// if:
+// - keyword is `GO`, and
+// - looking backwards there's only (any) whitespace preceded by a
newline
+// then: `GO` iSN'T a column alias
+if kw == &Keyword::GO {
+let mut look_back_count = 2;
+loop {
Review Comment:
Can we have this logic as a helper function?
``` rust
fn prev_nth_token_no_skip(&self, mut n: usize) -> Option<&TokenWithSpan> {
// ...
}
```
like a combination of
[peek_nth_token](https://github.com/apache/datafusion-sqlparser-rs/blob/ed181fde55fa090b42f047bb0fbb84cf494904ef/src/parser/mod.rs#L3922C5-L3922C69)
and
[prev_token](https://github.com/apache/datafusion-sqlparser-rs/blob/ed181fde55fa090b42f047bb0fbb84cf494904ef/src/parser/mod.rs#L4034)
##
src/ast/mod.rs:
##
@@ -4054,6 +4054,12 @@ pub enum Statement {
arguments: Vec,
options: Vec,
},
+/// Go (MSSQL)
+///
+/// GO is not a Transact-SQL statement; it is a command recognized by
various tools as a batch delimiter
+///
+/// See
Review Comment:
For visibility we can move
Re: [PR] Add support for `GO` batch delimiter in SQL Server [datafusion-sqlparser-rs]
aharpervc commented on code in PR #1809:
URL:
https://github.com/apache/datafusion-sqlparser-rs/pull/1809#discussion_r2045640591
##
src/parser/mod.rs:
##
@@ -15058,6 +15069,57 @@ impl<'a> Parser<'a> {
}
}
+/// Parse [Statement::Go]
+fn parse_go(&mut self) -> Result {
+// previous token should be a newline (skipping non-newline whitespace)
+// see also, `previous_token`
+let mut look_back_count = 2;
+loop {
Review Comment:
This can maybe be simplified. Also, needs a test when `go` starts a file but
followed by other statements
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] Add support for `GO` batch delimiter in SQL Server [datafusion-sqlparser-rs]
aharpervc commented on PR #1809: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1809#issuecomment-2806522733 Rebased on main & updated -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] Add support for `GO` batch delimiter in SQL Server [datafusion-sqlparser-rs]
aharpervc commented on code in PR #1809:
URL:
https://github.com/apache/datafusion-sqlparser-rs/pull/1809#discussion_r2043088266
##
src/parser/mod.rs:
##
@@ -475,6 +475,12 @@ impl<'a> Parser<'a> {
if expecting_statement_delimiter && word.keyword ==
Keyword::END {
break;
}
+// Treat batch delimiter as an end of statement
+if expecting_statement_delimiter && dialect_of!(self is
MsSqlDialect) {
+if let Some(Statement::Go { count: _ }) = stmts.last()
{
+expecting_statement_delimiter = false;
+}
+}
Review Comment:
I've now done that due to another problem I noticed, for IF statements (new
test case also added...)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] Add support for `GO` batch delimiter in SQL Server [datafusion-sqlparser-rs]
aharpervc commented on code in PR #1809:
URL:
https://github.com/apache/datafusion-sqlparser-rs/pull/1809#discussion_r2042385712
##
src/parser/mod.rs:
##
@@ -15017,6 +15026,48 @@ impl<'a> Parser<'a> {
}
}
+fn parse_go(&mut self) -> Result {
+// previous token should be a newline (skipping non-newline whitespace)
Review Comment:
Does [my comment
above](https://github.com/apache/datafusion-sqlparser-rs/pull/1809#discussion_r2042359100)
& the code comment below, "using this peek function because we want to halt
this statement parsing upon newline" help?
```mssql
-- this should parse as `Go { Some(4) }`
GO 4
-- this should parse as something like `Go { None }`, `Expr(4)`
GO
4
```
Eg, the whitespace is meaningful for parsing
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] Add support for `GO` batch delimiter in SQL Server [datafusion-sqlparser-rs]
aharpervc commented on code in PR #1809:
URL:
https://github.com/apache/datafusion-sqlparser-rs/pull/1809#discussion_r2042947584
##
src/parser/mod.rs:
##
@@ -475,6 +475,12 @@ impl<'a> Parser<'a> {
if expecting_statement_delimiter && word.keyword ==
Keyword::END {
break;
}
+// Treat batch delimiter as an end of statement
+if expecting_statement_delimiter && dialect_of!(self is
MsSqlDialect) {
+if let Some(Statement::Go { count: _ }) = stmts.last()
{
+expecting_statement_delimiter = false;
+}
+}
Review Comment:
In terms of the code, I worked out a perhaps slightly less confusing way to
set whether or not a semi-colon is expected,
[here](https://github.com/apache/datafusion-sqlparser-rs/pull/1808/files#diff-fc6a8d66b8cb6bd48a119a70e489cbcef82e7f551e7cf08e8e972ef4e774ef49R487-R493).
This branch can be updated accordingly if we agree over there
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] Add support for `GO` batch delimiter in SQL Server [datafusion-sqlparser-rs]
aharpervc commented on code in PR #1809:
URL:
https://github.com/apache/datafusion-sqlparser-rs/pull/1809#discussion_r2042385712
##
src/parser/mod.rs:
##
@@ -15017,6 +15026,48 @@ impl<'a> Parser<'a> {
}
}
+fn parse_go(&mut self) -> Result {
+// previous token should be a newline (skipping non-newline whitespace)
Review Comment:
Does [my comment
above](https://github.com/apache/datafusion-sqlparser-rs/pull/1809#discussion_r2042359100)
& the code comment below, "using this peek function because we want to halt
this statement parsing upon newline" help?
```mssql
-- this should parse as `Go { Some(4) }`
GO 4`
-- this should parse as something like `Go { None }`, `Expr(4)`
GO
4
```
Eg, the whitespace is meaningful for parsing
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] Add support for `GO` batch delimiter in SQL Server [datafusion-sqlparser-rs]
aharpervc commented on code in PR #1809:
URL:
https://github.com/apache/datafusion-sqlparser-rs/pull/1809#discussion_r2042359100
##
src/parser/mod.rs:
##
@@ -475,6 +475,12 @@ impl<'a> Parser<'a> {
if expecting_statement_delimiter && word.keyword ==
Keyword::END {
break;
}
+// Treat batch delimiter as an end of statement
+if expecting_statement_delimiter && dialect_of!(self is
MsSqlDialect) {
+if let Some(Statement::Go { count: _ }) = stmts.last()
{
+expecting_statement_delimiter = false;
+}
+}
Review Comment:
I tried to explain this in the commit message & the "multiple_gos" example
in the test. Basically, we want this to parse as 4 statements:
```mssql
SELECT 1;
GO
SELECT 2;
GO
```
The second `GO` is terminated by EOF, so no problem. But the first `GO`
_cannot_ end with a semi colon (This overlaps with the ongoing "no semi-colons
for statements" discussion but from a different angle). So we need to figure
out how to parse it all out without the existing conventional semi-colon logic.
What I came up with is to basically just look for a GO and have that signal
that we _aren't_ then also looking for statement delimiter (eg, semi-colon).
This also makes sense semantically because statements can't extend between
batches, so any batch delimiter also functions as a statement delimiter.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] Add support for `GO` batch delimiter in SQL Server [datafusion-sqlparser-rs]
aharpervc commented on code in PR #1809:
URL:
https://github.com/apache/datafusion-sqlparser-rs/pull/1809#discussion_r2042317653
##
tests/sqlparser_mssql.rs:
##
@@ -2036,3 +2036,78 @@ fn parse_mssql_merge_with_output() {
OUTPUT $action, deleted.ProductID INTO dsi.temp_products";
ms_and_generic().verified_stmt(stmt);
}
+
+#[test]
+fn parser_mssql_go_keyword() {
+let single_go_keyword = "USE some_database;\nGO";
+let stmts = ms().parse_sql_statements(single_go_keyword).unwrap();
+assert_eq!(stmts.len(), 2);
+assert_eq!(
+stmts[1],
+Statement::Go {
+count: None,
+}
+);
+
+let go_with_count = "SELECT 1;\nGO 5";
+let stmts = ms().parse_sql_statements(go_with_count).unwrap();
+assert_eq!(stmts.len(), 2);
+assert_eq!(
+stmts[1],
+Statement::Go {
+count: Some(5),
+}
+);
+
+let multiple_gos = "SELECT 1;\nGO 5\nSELECT 2;\n GO";
+let stmts = ms().parse_sql_statements(multiple_gos).unwrap();
+assert_eq!(stmts.len(), 4);
+assert_eq!(
+stmts[1],
+Statement::Go {
+count: Some(5),
+}
+);
+assert_eq!(
+stmts[3],
+Statement::Go {
+count: None,
+}
+);
+
+let comment_following_go = "USE some_database;\nGO -- okay";
+let stmts = ms().parse_sql_statements(comment_following_go).unwrap();
+assert_eq!(stmts.len(), 2);
+assert_eq!(
+stmts[1],
+Statement::Go {
+count: None,
+}
+);
+
+let actually_column_alias = "SELECT NULL AS GO";
+let stmt = ms().verified_only_select(actually_column_alias);
+assert_eq!(
+only(stmt.projection),
+SelectItem::ExprWithAlias {
+expr: Expr::Value(Value::Null.with_empty_span()),
+alias: Ident::new("GO"),
+}
+);
+
+let invalid_go_position = "SELECT 1; GO";
+let err = ms().parse_sql_statements(invalid_go_position);
+assert!(err.is_err());
+assert_eq!(
+err.unwrap_err().to_string(),
+"sql parser error: Expected: newline before GO, found: ;"
+);
+
+let invalid_go_count = "SELECT 1\nGO x";
+let err = ms().parse_sql_statements(invalid_go_count);
+assert!(err.is_err());
Review Comment:
Ah yes great point, done 👍
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] Add support for `GO` batch delimiter in SQL Server [datafusion-sqlparser-rs]
aharpervc commented on code in PR #1809:
URL:
https://github.com/apache/datafusion-sqlparser-rs/pull/1809#discussion_r2042331806
##
src/ast/mod.rs:
##
@@ -4050,6 +4050,14 @@ pub enum Statement {
arguments: Vec,
options: Vec,
},
+/// Go (SQL Server)
+///
+/// GO is not a Transact-SQL statement; it is a command recognized by
various tools as a batch delimiter
+///
+/// See:
https://learn.microsoft.com/en-us/sql/t-sql/language-elements/sql-server-utilities-statements-go
+Go {
+count: Option,
+},
Review Comment:
Done 👍
##
src/parser/mod.rs:
##
@@ -617,6 +623,9 @@ impl<'a> Parser<'a> {
}
// `COMMENT` is snowflake specific
https://docs.snowflake.com/en/sql-reference/sql/comment
Keyword::COMMENT if self.dialect.supports_comment_on() =>
self.parse_comment(),
+Keyword::GO if dialect_of!(self is MsSqlDialect) => {
Review Comment:
Done 👍
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] Add support for `GO` batch delimiter in SQL Server [datafusion-sqlparser-rs]
aharpervc commented on code in PR #1809:
URL:
https://github.com/apache/datafusion-sqlparser-rs/pull/1809#discussion_r2042317653
##
tests/sqlparser_mssql.rs:
##
@@ -2036,3 +2036,78 @@ fn parse_mssql_merge_with_output() {
OUTPUT $action, deleted.ProductID INTO dsi.temp_products";
ms_and_generic().verified_stmt(stmt);
}
+
+#[test]
+fn parser_mssql_go_keyword() {
+let single_go_keyword = "USE some_database;\nGO";
+let stmts = ms().parse_sql_statements(single_go_keyword).unwrap();
+assert_eq!(stmts.len(), 2);
+assert_eq!(
+stmts[1],
+Statement::Go {
+count: None,
+}
+);
+
+let go_with_count = "SELECT 1;\nGO 5";
+let stmts = ms().parse_sql_statements(go_with_count).unwrap();
+assert_eq!(stmts.len(), 2);
+assert_eq!(
+stmts[1],
+Statement::Go {
+count: Some(5),
+}
+);
+
+let multiple_gos = "SELECT 1;\nGO 5\nSELECT 2;\n GO";
+let stmts = ms().parse_sql_statements(multiple_gos).unwrap();
+assert_eq!(stmts.len(), 4);
+assert_eq!(
+stmts[1],
+Statement::Go {
+count: Some(5),
+}
+);
+assert_eq!(
+stmts[3],
+Statement::Go {
+count: None,
+}
+);
+
+let comment_following_go = "USE some_database;\nGO -- okay";
+let stmts = ms().parse_sql_statements(comment_following_go).unwrap();
+assert_eq!(stmts.len(), 2);
+assert_eq!(
+stmts[1],
+Statement::Go {
+count: None,
+}
+);
+
+let actually_column_alias = "SELECT NULL AS GO";
+let stmt = ms().verified_only_select(actually_column_alias);
+assert_eq!(
+only(stmt.projection),
+SelectItem::ExprWithAlias {
+expr: Expr::Value(Value::Null.with_empty_span()),
+alias: Ident::new("GO"),
+}
+);
+
+let invalid_go_position = "SELECT 1; GO";
+let err = ms().parse_sql_statements(invalid_go_position);
+assert!(err.is_err());
+assert_eq!(
+err.unwrap_err().to_string(),
+"sql parser error: Expected: newline before GO, found: ;"
+);
+
+let invalid_go_count = "SELECT 1\nGO x";
+let err = ms().parse_sql_statements(invalid_go_count);
+assert!(err.is_err());
Review Comment:
Ah yes great point
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] Add support for `GO` batch delimiter in SQL Server [datafusion-sqlparser-rs]
iffyio commented on code in PR #1809:
URL:
https://github.com/apache/datafusion-sqlparser-rs/pull/1809#discussion_r2040595052
##
src/parser/mod.rs:
##
@@ -617,6 +623,9 @@ impl<'a> Parser<'a> {
}
// `COMMENT` is snowflake specific
https://docs.snowflake.com/en/sql-reference/sql/comment
Keyword::COMMENT if self.dialect.supports_comment_on() =>
self.parse_comment(),
+Keyword::GO if dialect_of!(self is MsSqlDialect) => {
Review Comment:
```suggestion
Keyword::GO => {
```
I think it should be fine to let the parser always accept the statement if
it shows up. Technically if we wanted to gate it behind a feature then we could
use a `self.dialect.supports` flag since the `dialect_of` macro is deprecated,
but I think it makes more sense to let the parser accept unconditionally
##
src/parser/mod.rs:
##
@@ -475,6 +475,12 @@ impl<'a> Parser<'a> {
if expecting_statement_delimiter && word.keyword ==
Keyword::END {
break;
}
+// Treat batch delimiter as an end of statement
+if expecting_statement_delimiter && dialect_of!(self is
MsSqlDialect) {
+if let Some(Statement::Go { count: _ }) = stmts.last()
{
+expecting_statement_delimiter = false;
+}
+}
Review Comment:
not sure I followed this part, what was the intention?
##
src/parser/mod.rs:
##
@@ -15017,6 +15026,48 @@ impl<'a> Parser<'a> {
}
}
+fn parse_go(&mut self) -> Result {
Review Comment:
One thing we could do, in the `parse_stmt()` function could we call
`self.prev_token()` to rewind the `Go` keyword so that this function becomes
self contained in being able to parse a full `Go` statement?
##
src/parser/mod.rs:
##
@@ -15017,6 +15026,48 @@ impl<'a> Parser<'a> {
}
}
+fn parse_go(&mut self) -> Result {
+// previous token should be a newline (skipping non-newline whitespace)
Review Comment:
hmm I didn't look too closely but not sure I followed the function body on a
high level, from the docs and the representation in the parser the Go statement
only contains an optional int, so that I imagined all the function needs to do
would be to `maybe_parse(|parser| parser.parse_number_value())` or similar? The
function is currently managing whitespace and semicolon but not super clear to
me why that is required
##
src/ast/mod.rs:
##
@@ -4050,6 +4050,14 @@ pub enum Statement {
arguments: Vec,
options: Vec,
},
+/// Go (SQL Server)
+///
+/// GO is not a Transact-SQL statement; it is a command recognized by
various tools as a batch delimiter
+///
+/// See:
https://learn.microsoft.com/en-us/sql/t-sql/language-elements/sql-server-utilities-statements-go
+Go {
+count: Option,
+},
Review Comment:
Repr wise can we wrap the statement body in explicit structs? we're[
transitioning
away](https://github.com/apache/datafusion-sqlparser-rs/issues/1204) from
anonymous structs in order to make the API more ergonomic. Something like
```rust
struct GoStatement {
count: Option
}
Statement::Go(GoStatement)
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
[PR] Add support for `GO` batch delimiter in SQL Server [datafusion-sqlparser-rs]
aharpervc opened a new pull request, #1809: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1809 Reference: https://learn.microsoft.com/en-us/sql/t-sql/language-elements/sql-server-utilities-statements-go Lots of conventional SQL Server tooling supports `GO`, so it seems really useful & prudent to be able to parse it. I realize that strictly speaking, it isn't a SQL "statement". However, I think it makes sense to model here as a statement since it acts as one in all regards except documentation. In terms of parsing, `GO` has some peculiarities I tried to express with the new tests, such as with regards to position and basically treating newline as a delimiter. This also starts to nudge on https://github.com/apache/datafusion-sqlparser-rs/issues/1800, where newlines may possibly be interpreted as statement delimiters in a general case. However, my approach here for `GO` is isolated to only that statement, at least for the time being. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
