[jira] [Commented] (HIVE-27755) Quote identifiers in SQL emitted by SchemaTool for MySQL
[ https://issues.apache.org/jira/browse/HIVE-27755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17773173#comment-17773173 ] Stamatis Zampetakis commented on HIVE-27755: The queries just above do not come from the validate command but from the test itself. The {{TestSchemaToolForMetastore.testValidateSequences}} case sends some [hardcoded DML commands|https://github.com/apache/hive/blob/20be17d19c19a1482dc3b1b753ce1f342b940e36/standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/tools/schematool/TestSchemaToolForMetastore.java#L122] before/after validation. > Quote identifiers in SQL emitted by SchemaTool for MySQL > > > Key: HIVE-27755 > URL: https://issues.apache.org/jira/browse/HIVE-27755 > Project: Hive > Issue Type: Improvement > Components: Standalone Metastore >Affects Versions: 4.0.0-beta-1 >Reporter: Stamatis Zampetakis >Assignee: Stamatis Zampetakis >Priority: Major > Labels: pull-request-available > Attachments: TestMysql-upgrade-after.txt, > TestMysql-upgrade-before.txt, > TestSchemaToolForMetastore-validateSequences-after.txt, > TestSchemaToolForMetastore-validateSequences-before.txt, > TestSchemaToolForMetastore-validateTables-after.txt, > TestSchemaToolForMetastore-validateTables-before.txt > > > Various SchemaTool options/tasks (e.g., "validate") generate and run SQL > statements on the underlying database. Depending on the database identifiers > in the SQL statements may be quoted (see > [https://github.com/apache/hive/blob/2dbfbeefc1a73d6a50f1c829658846fc827fc780/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/tools/schematool/HiveSchemaHelper.java#L173]). > Currently, all identifiers are quoted when the database is Postgres and this > tickets aims to do the same for MySQL/MariaDB. > The main motivation behind this change is to avoid unexpected surprises and > query failures when/if the database decides to turn some of the > tables/columns we are using internally to reserved keywords. > As a concrete example, the Percona fork of MySQL recently turned > SEQUENCE_TABLE into a reserved keyword > ([https://docs.percona.com/percona-server/8.0/flexibility/sequence_table.html]) > and this comes in conflict with our internal metastore table. > The installation scripts do not fail since in that case SEQUENCE_TABLE is > quoted > ([https://github.com/apache/hive/blob/2dbfbeefc1a73d6a50f1c829658846fc827fc780/standalone-metastore/metastore-server/src/main/sql/mysql/hive-schema-4.0.0-beta-2.mysql.sql#L447]) > but validation queries emitted by the SchemaTool will fail > ([https://github.com/apache/hive/blob/2dbfbeefc1a73d6a50f1c829658846fc827fc780/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/tools/schematool/SchemaToolTaskValidate.java#L117]) > if we don't use quoted identifiers. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (HIVE-27755) Quote identifiers in SQL emitted by SchemaTool for MySQL
[ https://issues.apache.org/jira/browse/HIVE-27755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17773136#comment-17773136 ] Zhihua Deng commented on HIVE-27755: In the TestSchemaToolForMetastore-validateSequences-after.txt, are these from the validate command? {noformat} 2023-09-28T11:28:55.582079Z 9 Query delete from SEQUENCE_TABLE 2023-09-28T11:28:55.584104Z 9 Query insert into SEQUENCE_TABLE values('org.apache.hadoop.hive.metastore.model.MDatabase', 100){noformat} > Quote identifiers in SQL emitted by SchemaTool for MySQL > > > Key: HIVE-27755 > URL: https://issues.apache.org/jira/browse/HIVE-27755 > Project: Hive > Issue Type: Improvement > Components: Standalone Metastore >Affects Versions: 4.0.0-beta-1 >Reporter: Stamatis Zampetakis >Assignee: Stamatis Zampetakis >Priority: Major > Labels: pull-request-available > Attachments: TestMysql-upgrade-after.txt, > TestMysql-upgrade-before.txt, > TestSchemaToolForMetastore-validateSequences-after.txt, > TestSchemaToolForMetastore-validateSequences-before.txt, > TestSchemaToolForMetastore-validateTables-after.txt, > TestSchemaToolForMetastore-validateTables-before.txt > > > Various SchemaTool options/tasks (e.g., "validate") generate and run SQL > statements on the underlying database. Depending on the database identifiers > in the SQL statements may be quoted (see > [https://github.com/apache/hive/blob/2dbfbeefc1a73d6a50f1c829658846fc827fc780/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/tools/schematool/HiveSchemaHelper.java#L173]). > Currently, all identifiers are quoted when the database is Postgres and this > tickets aims to do the same for MySQL/MariaDB. > The main motivation behind this change is to avoid unexpected surprises and > query failures when/if the database decides to turn some of the > tables/columns we are using internally to reserved keywords. > As a concrete example, the Percona fork of MySQL recently turned > SEQUENCE_TABLE into a reserved keyword > ([https://docs.percona.com/percona-server/8.0/flexibility/sequence_table.html]) > and this comes in conflict with our internal metastore table. > The installation scripts do not fail since in that case SEQUENCE_TABLE is > quoted > ([https://github.com/apache/hive/blob/2dbfbeefc1a73d6a50f1c829658846fc827fc780/standalone-metastore/metastore-server/src/main/sql/mysql/hive-schema-4.0.0-beta-2.mysql.sql#L447]) > but validation queries emitted by the SchemaTool will fail > ([https://github.com/apache/hive/blob/2dbfbeefc1a73d6a50f1c829658846fc827fc780/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/tools/schematool/SchemaToolTaskValidate.java#L117]) > if we don't use quoted identifiers. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (HIVE-27755) Quote identifiers in SQL emitted by SchemaTool for MySQL
[ https://issues.apache.org/jira/browse/HIVE-27755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17770042#comment-17770042 ] Stamatis Zampetakis commented on HIVE-27755: For testing the changes, I enabled the general_log for MySQL (https://dev.mysql.com/doc/refman/8.0/en/server-system-variables.html#sysvar_general_log) and run the following tests before and after the changes in PR#4757: {noformat} cd standalone-metastore/metastore-server mvn test -Dtest=TestMysql#upgrade -Dtest.groups="" mvn test -Dtest=TestSchemaToolForMetastore#testValidateSchemaTables*Mysql* -Dtest.groups="" (requires patch in #4754) mvn test -Dtest=TestSchemaToolForMetastore#testValidateSequences*Mysql* -Dtest.groups="" (requires patch in #4754) {noformat} I monitored the general_log output generated by the aforementioned tests and I compared before and after files for each test verifying that table and column names are quoted as expected. The before and after files from the general_log are attached in this JIRA. > Quote identifiers in SQL emitted by SchemaTool for MySQL > > > Key: HIVE-27755 > URL: https://issues.apache.org/jira/browse/HIVE-27755 > Project: Hive > Issue Type: Improvement > Components: Standalone Metastore >Affects Versions: 4.0.0-beta-1 >Reporter: Stamatis Zampetakis >Assignee: Stamatis Zampetakis >Priority: Major > Labels: pull-request-available > Attachments: TestMysql-upgrade-after.txt, > TestMysql-upgrade-before.txt, > TestSchemaToolForMetastore-validateSequences-after.txt, > TestSchemaToolForMetastore-validateSequences-before.txt, > TestSchemaToolForMetastore-validateTables-after.txt, > TestSchemaToolForMetastore-validateTables-before.txt > > > Various SchemaTool options/tasks (e.g., "validate") generate and run SQL > statements on the underlying database. Depending on the database identifiers > in the SQL statements may be quoted (see > [https://github.com/apache/hive/blob/2dbfbeefc1a73d6a50f1c829658846fc827fc780/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/tools/schematool/HiveSchemaHelper.java#L173]). > Currently, all identifiers are quoted when the database is Postgres and this > tickets aims to do the same for MySQL/MariaDB. > The main motivation behind this change is to avoid unexpected surprises and > query failures when/if the database decides to turn some of the > tables/columns we are using internally to reserved keywords. > As a concrete example, the Percona fork of MySQL recently turned > SEQUENCE_TABLE into a reserved keyword > ([https://docs.percona.com/percona-server/8.0/flexibility/sequence_table.html]) > and this comes in conflict with our internal metastore table. > The installation scripts do not fail since in that case SEQUENCE_TABLE is > quoted > ([https://github.com/apache/hive/blob/2dbfbeefc1a73d6a50f1c829658846fc827fc780/standalone-metastore/metastore-server/src/main/sql/mysql/hive-schema-4.0.0-beta-2.mysql.sql#L447]) > but validation queries emitted by the SchemaTool will fail > ([https://github.com/apache/hive/blob/2dbfbeefc1a73d6a50f1c829658846fc827fc780/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/tools/schematool/SchemaToolTaskValidate.java#L117]) > if we don't use quoted identifiers. -- This message was sent by Atlassian Jira (v8.20.10#820010)