mvzink commented on code in PR #1986:
URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1986#discussion_r2248453508
##########
src/parser/mod.rs:
##########
@@ -1248,6 +1248,12 @@ impl<'a> Parser<'a> {
         debug!("parsing expr");
         let mut expr = self.parse_prefix()?;
+        // We would have exited early in `parse_prefix` before checking for `COLLATE`, and there's
+        // no infix operator handling for `COLLATE`, so we must return now.
+        if self.in_column_definition_state() && self.peek_keyword(Keyword::COLLATE) {
+            return Ok(expr);
+        }

Review Comment:
   If there is no infix handling for a given token and we call `parse_infix` anyway, we get a `No infix parser for token` error. In practice, we avoid this (I would say accidentally) for all dialects other than PostgreSQL, because the default precedence of `COLLATE` is 0. In PostgreSQL, it is 120.

   Consider the column options `DEFAULT 'foo' COLLATE 'en_US'`.

   Without this early return, after parsing `'foo'`, we fall through to checking the precedence of the next token. By default it is `0`, which is `<=` the current precedence (also 0), so we break and return `'foo'`. For PostgreSQL, however, it is 120, so we flow into infix parsing (i.e. we treat `COLLATE` as an infix operator, which we don't handle because technically it isn't one).
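   The control flow described above can be illustrated with a small, self-contained sketch. This is not the sqlparser code: `Tok`, `Dialect`, `parse_expr`, and the `early_return` flag are hypothetical names made up for the illustration, and the only values taken from the discussion are the `COLLATE` precedences (0 by default, 120 for PostgreSQL) and the error message prefix.

```rust
/// Toy tokens: a string literal and the COLLATE keyword (hypothetical, for illustration).
#[derive(Debug, Clone, PartialEq)]
enum Tok {
    Str(&'static str),
    Collate,
}

/// Hypothetical stand-in for a dialect; only its COLLATE precedence matters here.
struct Dialect {
    collate_prec: u8,
}

impl Dialect {
    /// Precedence of the upcoming token; 0 means "not an infix operator".
    fn next_precedence(&self, tok: Option<&Tok>) -> u8 {
        match tok {
            Some(Tok::Collate) => self.collate_prec,
            _ => 0,
        }
    }
}

/// Parse one expression at precedence 0, mimicking the shape of the flow under discussion.
fn parse_expr(
    dialect: &Dialect,
    toks: &[Tok],
    in_column_definition: bool,
    early_return: bool,
) -> Result<String, String> {
    // "Prefix" step: this toy grammar only accepts a single string literal.
    let (expr, rest) = match toks.split_first() {
        Some((Tok::Str(s), rest)) => (format!("'{s}'"), rest),
        other => return Err(format!("Expected a literal, found {other:?}")),
    };

    // The special case: leave COLLATE for the column-option parser to consume.
    if early_return && in_column_definition && rest.first() == Some(&Tok::Collate) {
        return Ok(expr);
    }

    // Precedence check: stop unless the next token binds tighter than the current level.
    let precedence = 0u8;
    let next = dialect.next_precedence(rest.first());
    if precedence >= next {
        return Ok(expr);
    }

    // The toy has no infix operators at all, so any token that gets here is an error.
    Err(format!("No infix parser for token {:?}", rest[0]))
}

fn main() {
    let toks = [Tok::Str("foo"), Tok::Collate, Tok::Str("en_US")];
    let generic = Dialect { collate_prec: 0 };
    let postgres = Dialect { collate_prec: 120 };

    println!("{:?}", parse_expr(&generic, &toks, true, false));
    println!("{:?}", parse_expr(&postgres, &toks, true, false));
    println!("{:?}", parse_expr(&postgres, &toks, true, true));
}
```

   In this sketch the first call returns `'foo'` because the precedence check stops at 0, the second fails with a `No infix parser for token` error because 120 is greater than 0, and the third returns `'foo'` again thanks to the early return. The real parser's failure on the equivalent input follows.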
   The result is this:

   ```
   2025-08-01T16:56:30.976Z DEBUG [sqlparser::parser] prefix: Value(ValueWithSpan { value: SingleQuotedString("foo"), span: Span(Location(0,0)..Location(0,0)) })
   2025-08-01T16:56:30.976Z DEBUG [sqlparser::dialect::postgresql] get_next_precedence() TokenWithSpan { token: Word(Word { value: "COLLATE", quote_style: None, keyword: COLLATE }), span: Span(Location(0,0)..Location(0,0)) }
   2025-08-01T16:56:30.976Z DEBUG [sqlparser::parser] next precedence: 120
   2025-08-01T16:56:30.976Z DEBUG [sqlparser::parser] infix: TokenWithSpan { token: Word(Word { value: "COLLATE", quote_style: None, keyword: COLLATE }), span: Span(Location(0,0)..Location(0,0)) }

   thread 'test_parse_default_with_collate_column_option' panicked at src/test_utils.rs:157:61:
   CREATE TABLE foo (abc TEXT DEFAULT 'foo' COLLATE 'en_US'): ParserError("No infix parser for token Word(Word { value: \"COLLATE\", quote_style: None, keyword: COLLATE })")
   ```

   I am not 100% sure this special case is the best way to fix this, but as I understand it, it is necessary so long as we have special handling for `COLLATE` in `parse_prefix`; and that, in turn, is necessary so long as we don't parse the RHS as an expression. I could experiment with treating `COLLATE` as an infix operator, but I don't really know how that would go; at least PostgreSQL and MySQL don't allow anything other than a single collation name on the right-hand side.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at: us...@infra.apache.org