GitHub user jodersky commented on the issue:

    https://github.com/apache/spark/pull/15398
  
    @gatorsmile, I updated the PR according to your comments.
    
    Whilst adding the example test
    ```scala
    checkEvaluation("""%SystemDrive%\Users\John""" like """\%SystemDrive\%\\Users%""", true)
    ```
    I noticed that the equivalent SQL string:
    ```scala
    spark.sql("""'%SystemDrive%\Users\John' like '\%SystemDrive\%\\Users%'""")
    ```
    will throw an exception:
    `org.apache.spark.sql.AnalysisException: the pattern '\%SystemDrive\%\Users%' is invalid, the escape character is not allowed to precede 'U';`
    (Note that one backslash is missing before "Users".)
    
    I traced the issue down to the ANTLR4 parser, which calls
    [unescapeSQLString](https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/ParserUtils.scala#L100)
    on the string passed to `spark.sql`. It unescapes every escaped character
    except for `\_` and `\%`, and thereby removes one backslash before "Users".
    There are two potential solutions, both flawed in my opinion:
    
    1. Escape the backslash twice, requiring 4 backslashes in total:
    `spark.sql("""'%SystemDrive%\Users\John' like '\%SystemDrive\%\\\\Users%'""")`
    This is inconsistent, as escaping the percent sign requires only one backslash.
    
    2. Add a special rule to the parser that ignores double backslashes. This
    won't play nicely with custom escape characters in the future.
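    
    To make the double unescaping concrete, here is a minimal sketch of the two passes. `unescapeLiteral` and `firstInvalidEscape` are hypothetical stand-ins that only model the behaviour described above (the real `unescapeSQLString` also handles sequences such as `\n` and unicode escapes); they are not Spark's actual code.
    ```scala
    object DoubleUnescapeSketch {
      // Pass 1 (hypothetical model of the parser): drop the backslash in front
      // of any character except '_' and '%'.
      def unescapeLiteral(s: String): String = {
        val out = new StringBuilder
        var i = 0
        while (i < s.length) {
          val c = s.charAt(i)
          if (c == '\\' && i + 1 < s.length) {
            val next = s.charAt(i + 1)
            if (next == '_' || next == '%') out.append(c) // keep "\_" and "\%" intact
            out.append(next)
            i += 2
          } else {
            out.append(c)
            i += 1
          }
        }
        out.toString
      }

      // Pass 2 (hypothetical model of the LIKE check): the escape character may
      // only precede '_', '%' or itself. Returns the first character that is
      // illegally escaped; a trailing escape is ignored in this sketch.
      def firstInvalidEscape(pattern: String, escape: Char = '\\'): Option[Char] = {
        var i = 0
        while (i < pattern.length - 1) {
          if (pattern.charAt(i) == escape) {
            val next = pattern.charAt(i + 1)
            if (next != '_' && next != '%' && next != escape) return Some(next)
            i += 2
          } else {
            i += 1
          }
        }
        None
      }

      def main(args: Array[String]): Unit = {
        val written = """\%SystemDrive\%\\Users%"""
        val seenByLike = unescapeLiteral(written)
        println(seenByLike)                     // \%SystemDrive\%\Users%
        println(firstInvalidEscape(seenByLike)) // Some(U)

        val quadrupled = """\%SystemDrive\%\\\\Users%"""
        println(firstInvalidEscape(unescapeLiteral(quadrupled))) // None
      }
    }
    ```
    Under these assumptions, the pattern as written loses one backslash in the first pass and is rejected in the second, while the four-backslash form from option 1 survives both.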
    
    My question is more fundamental: why are SQL strings unescaped in the first
    place? Should it not be up to the frontend language to handle such escapes?
    I.e. Scala/Java/Python should handle replacing things such as `\n` with a
    newline, or replacing unicode escape sequences with their corresponding
    symbols. Getting rid of the extra unescaping would, in my opinion, be the
    cleanest way to avoid double-escaping.
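    
    As a small illustration of the frontend already handling escapes (an added sketch; `FrontendEscapes` is a made-up name): Scala resolves escapes in regular string literals and leaves triple-quoted ones untouched, so the user has already chosen one behaviour or the other before the string reaches `spark.sql`.
    ```scala
    object FrontendEscapes {
      // Regular literal: the Scala compiler already turns \n into a newline.
      val cooked: String = "a\nb"
      // Triple-quoted literal: the backslash and the 'n' are kept verbatim.
      val raw: String = """a\nb"""

      def main(args: Array[String]): Unit = {
        println(cooked.length) // 3
        println(raw.length)    // 4
      }
    }
    ```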

