[GitHub] [iceberg] manisin opened a new pull request, #6674: Add support for special characters in snowflake identifiers for Snowflake Catalog

via GitHub Thu, 26 Jan 2023 11:55:35 -0800


manisin opened a new pull request, #6674:
URL: https://github.com/apache/iceberg/pull/6674


   Currently the catalog cannot handle database, schema, or table names 
(Snowflake identifiers) that contain special characters. The limitation comes 
from sanitizing the parameter of the LIKE clause (which Snowflake JDBC's 
PreparedStatement does not support well) in the SQL statements used to 
determine whether a database or schema exists (e.g. SHOW DATABASES LIKE '%s' 
IN ACCOUNT). The proposed alternative is to list the sub-namespace instead, 
i.e. list the schemas within a database or the tables within a schema 
(SHOW SCHEMAS/TABLES IN IDENTIFIER(?) LIMIT 1), and to rely on the exception 
thrown when the database or schema does not exist. This avoids the LIKE clause 
while remaining performant, and allows Snowflake identifiers with special 
characters to be passed in as quoted identifiers (more details at 
https://docs.snowflake.com/en/sql-reference/identifiers-syntax.html). The table 
shows the corresponding commands for creating and accessing database/schema 
objects in Snowflake and Spark. The "Quoted?" column indicates whether the 
identifier (database/schema name) is quoted per Snowflake's identifier 
convention.
   
   ```
   +-----+------------------------------------+---------+--------------------------------------------------------------+
   | Row | Snowflake                          | Quoted? | Spark                                                        |
   +-----+------------------------------------+---------+--------------------------------------------------------------+
   | 1   | create database "$peci@l"          | Yes     | spark.sql("use database \"$peci@l\"").show()                 |
   | 2   | create database """doubleQuoted""" | Yes     | spark.sql("use database \"\"\"doubleQuoted\"\"\"").show()    |
   | 3   | create database "Dot.ted"          | Yes     | spark.sql("use database \"Dot.ted\"").show()                 |
   | 4   | create schema "under_Score"        | Yes     | spark.sql("use database \"Dot.ted\".\"under_Score\"").show() |
   | 5   | create database lower              | No      | spark.sql("use database lower").show()                       |
   | -   |                                    |         | spark.sql("use database LOWER").show()                       |
   | -   |                                    |         | spark.sql("use database LoWeR").show()                       |
   | 6   | create database dollar$            | No      | spark.sql("use database DOLLAR$").show()                     |
   | -   |                                    |         | spark.sql("use database dollar$").show()                     |
   | -   |                                    |         | spark.sql("use database DOllar$").show()                     |
   | 7   | create database "lowerq"           | Yes     | spark.sql("use database `\"lowerq\"`").show()                |
   +-----+------------------------------------+---------+--------------------------------------------------------------+
   ```
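
   The quoting behaviour in the table can be illustrated with a small, 
self-contained sketch. It is not code from this PR: the helper names 
`quote_identifier` and `resolved_name` are hypothetical, and the sketch only 
models two rules from Snowflake's identifier documentation (an embedded double 
quote is written as two double quotes, and unquoted identifiers resolve as 
upper case).

   ```python
def quote_identifier(name: str) -> str:
    """Wrap a raw object name in double quotes, doubling embedded quotes.

    Hypothetical helper for illustration only.
    """
    return '"' + name.replace('"', '""') + '"'


def resolved_name(identifier: str) -> str:
    """Return the object name an identifier would resolve to.

    Quoted identifiers keep their exact case and characters; unquoted
    identifiers are folded to upper case. Hypothetical helper.
    """
    is_quoted = (
        len(identifier) >= 2
        and identifier[0] == '"'
        and identifier[-1] == '"'
    )
    if is_quoted:
        # Strip the outer quotes and undo the doubling of embedded quotes.
        return identifier[1:-1].replace('""', '"')
    return identifier.upper()
   ```

   Under these assumptions, `lower`, `LOWER` and `LoWeR` (row 5) all resolve to 
the same name, while `"lowerq"` (row 7) matches only the exact lower-case 
spelling and therefore has to be passed quoted.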
   The PR also populates the application parameter, which identifies the 
connecting application, on the underlying JDBC client: 
https://docs.snowflake.com/en/user-guide/jdbc-parameters.html#application
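
   The existence check described in this PR (list one child object and treat 
the resulting error as "does not exist") might look roughly like the sketch 
below. Everything here is an assumption made for illustration: `run_query` 
stands in for whatever executes a parameterized statement, `ObjectNotFound` 
stands in for the driver's error, and the SQL text is copied from the 
description above rather than from the actual change.

   ```python
from typing import Callable, List


class ObjectNotFound(Exception):
    """Hypothetical stand-in for the error raised for a missing object."""


def database_exists(
    run_query: Callable[[str, str], List[tuple]], database: str
) -> bool:
    """Check for a database by listing at most one of its schemas.

    The identifier is passed as a bind parameter, so quoted identifiers
    with special characters need no LIKE-pattern sanitizing.
    """
    try:
        # SQL text taken from the PR description; LIMIT 1 keeps the
        # listing cheap because only existence matters.
        run_query("SHOW SCHEMAS IN IDENTIFIER(?) LIMIT 1", database)
    except ObjectNotFound:
        return False
    # An empty listing still means the database itself exists.
    return True
   ```

   A real implementation would need to distinguish a "not found" error from 
other failures (permissions, connectivity) so that those are not reported as a 
missing database.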

