iffyio commented on code in PR #1735:
URL:
https://github.com/apache/datafusion-sqlparser-rs/pull/1735#discussion_r1974849855
##########
src/dialect/mod.rs:
##########
@@ -201,6 +201,33 @@ pub trait Dialect: Debug + Any {
false
}
+ /// Determine whether the dialect strips the backslash when escaping LIKE wildcards (%, _).
+ ///
+ /// [MySQL] has a special case when escaping single quoted strings which leaves these unescaped
+ /// so they can be used in LIKE patterns without double-escaping (as is necessary in other
+ /// escaping dialects, such as [Snowflake]). Generally, special characters have escaping rules
+ /// causing them to be replaced with a different byte sequences (e.g. `'\0'` becoming the zero
+ /// byte), and the default if an escaped character does not have a specific escaping rule is to
+ /// strip the backslash (e.g. there is no rule for `h`, so `'\h' = 'h'`). MySQL's special case
+ /// for ignoring LIKE wildcard escapes is to *not* strip the backslash, so that `'\%' = '\\%'`.
+ /// This applies to all string literals though, not just those used in LIKE patterns.
+ ///
+ /// ```text
+ /// mysql> select '\_', hex('\\'), hex('_'), hex('\_');
+ /// +----+-----------+----------+-----------+
+ /// | \_ | hex('\\') | hex('_') | hex('\_') |
+ /// +----+-----------+----------+-----------+
+ /// | \_ | 5C        | 5F       | 5C5F      |
+ /// +----+-----------+----------+-----------+
+ /// 1 row in set (0.00 sec)
+ /// ```
+ ///
+ /// [MySQL]: https://dev.mysql.com/doc/refman/8.4/en/string-literals.html
+ /// [Snowflake]: https://docs.snowflake.com/en/sql-reference/functions/like#usage-notes
+ fn ignores_like_wildcard_escapes(&self) -> bool {
Review Comment:
```suggestion
fn ignores_wildcard_escapes(&self) -> bool {
```
Maybe we drop the `like` part? As the doc comment itself notes, there is
nothing special about the `LIKE` syntax here; this is a general string
literal escape behavior.
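
To illustrate why the behavior is not specific to `LIKE`, here is a minimal sketch of the unescaping rule the doc comment describes. The `unescape_body` helper and its `ignores_wildcard_escapes` parameter are hypothetical names for illustration, not the tokenizer's actual code, and only a few escape rules are modeled:

```rust
/// Hypothetical helper (not part of sqlparser): un-escapes the body of a
/// single-quoted string literal, modeling only a handful of escape rules.
fn unescape_body(body: &str, ignores_wildcard_escapes: bool) -> String {
    let mut out = String::with_capacity(body.len());
    let mut chars = body.chars();
    while let Some(c) = chars.next() {
        if c != '\\' {
            out.push(c);
            continue;
        }
        match chars.next() {
            // Characters with a specific escaping rule.
            Some('0') => out.push('\0'),
            Some('n') => out.push('\n'),
            Some('t') => out.push('\t'),
            // MySQL-style special case: keep the backslash before % and _.
            Some(w @ ('%' | '_')) if ignores_wildcard_escapes => {
                out.push('\\');
                out.push(w);
            }
            // Default: no specific rule, so strip the backslash.
            Some(other) => out.push(other),
            // Assumption for this sketch: a trailing lone backslash is kept.
            None => out.push('\\'),
        }
    }
    out
}
```

The rule runs on every literal body regardless of where the literal appears in the statement, which is the argument for a name without `like` in it.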
##########
src/tokenizer.rs:
##########
@@ -807,6 +807,9 @@ pub struct Tokenizer<'a> {
/// If true (the default), the tokenizer will un-escape literal
/// SQL strings See [`Tokenizer::with_unescape`] for more details.
unescape: bool,
+ /// If true, the tokenizer will not escape % and _, for use in in LIKE patterns. See
+ /// [`Dialect::ignores_like_wildcard_escapes`] for more details.
+ ignore_like_wildcard_escapes: bool,
Review Comment:
Was there a reason to store this value here rather than relying solely on
the dialect via `self.dialect.ignores_like_wildcard_escapes()` when needed?
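
A rough sketch of what relying on the dialect alone could look like. The trait and structs below are simplified stand-ins rather than the crate's real definitions, and `keeps_wildcard_backslash`, `GenericLike` and `MySqlLike` are hypothetical names; combining the check with `unescape` is also an assumption:

```rust
use std::fmt::Debug;

/// Simplified stand-in for the `Dialect` trait, reduced to the one method.
trait Dialect: Debug {
    fn ignores_like_wildcard_escapes(&self) -> bool {
        false
    }
}

/// Hypothetical dialect that keeps the default (backslash is stripped).
#[derive(Debug)]
struct GenericLike;
impl Dialect for GenericLike {}

/// Hypothetical dialect that opts in to the MySQL-style behavior.
#[derive(Debug)]
struct MySqlLike;
impl Dialect for MySqlLike {
    fn ignores_like_wildcard_escapes(&self) -> bool {
        true
    }
}

/// Simplified stand-in for `Tokenizer`: it holds no cached copy of the flag.
struct Tokenizer<'a> {
    dialect: &'a dyn Dialect,
    unescape: bool,
}

impl<'a> Tokenizer<'a> {
    fn new(dialect: &'a dyn Dialect) -> Self {
        Self { dialect, unescape: true }
    }

    /// Asks the dialect at the point of use instead of reading a stored field.
    fn keeps_wildcard_backslash(&self) -> bool {
        self.unescape && self.dialect.ignores_like_wildcard_escapes()
    }
}
```

With this shape there is a single source of truth for the behavior, at the cost of a dynamic call each time the check is needed.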
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]