[jira] [Commented] (HIVE-27755) Quote identifiers in SQL emitted by SchemaTool for MySQL

2023-10-09 Thread Stamatis Zampetakis (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-27755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17773173#comment-17773173
 ] 

Stamatis Zampetakis commented on HIVE-27755:


The queries just above do not come from the validate command but from the test 
itself. 

The {{TestSchemaToolForMetastore.testValidateSequences}} case sends some 
[hardcoded DML 
commands|https://github.com/apache/hive/blob/20be17d19c19a1482dc3b1b753ce1f342b940e36/standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/tools/schematool/TestSchemaToolForMetastore.java#L122]
 before/after validation. 

> Quote identifiers in SQL emitted by SchemaTool for MySQL
> 
>
> Key: HIVE-27755
> URL: https://issues.apache.org/jira/browse/HIVE-27755
> Project: Hive
>  Issue Type: Improvement
>  Components: Standalone Metastore
>Affects Versions: 4.0.0-beta-1
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
> Attachments: TestMysql-upgrade-after.txt, 
> TestMysql-upgrade-before.txt, 
> TestSchemaToolForMetastore-validateSequences-after.txt, 
> TestSchemaToolForMetastore-validateSequences-before.txt, 
> TestSchemaToolForMetastore-validateTables-after.txt, 
> TestSchemaToolForMetastore-validateTables-before.txt
>
>
> Various SchemaTool options/tasks (e.g., "validate") generate and run SQL 
> statements on the underlying database. Depending on the database identifiers 
> in the SQL statements may be quoted (see 
> [https://github.com/apache/hive/blob/2dbfbeefc1a73d6a50f1c829658846fc827fc780/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/tools/schematool/HiveSchemaHelper.java#L173]).
> Currently, all identifiers are quoted when the database is Postgres and this 
> tickets aims to do the same for MySQL/MariaDB.
> The main motivation behind this change is to avoid unexpected surprises and 
> query failures when/if the database decides to turn some of the 
> tables/columns we are using internally to reserved keywords.
> As a concrete example, the Percona fork of MySQL recently turned 
> SEQUENCE_TABLE into a reserved keyword 
> ([https://docs.percona.com/percona-server/8.0/flexibility/sequence_table.html])
>  and this comes in conflict with our internal metastore table.
> The installation scripts do not fail since in that case SEQUENCE_TABLE is 
> quoted 
> ([https://github.com/apache/hive/blob/2dbfbeefc1a73d6a50f1c829658846fc827fc780/standalone-metastore/metastore-server/src/main/sql/mysql/hive-schema-4.0.0-beta-2.mysql.sql#L447])
>  but validation queries emitted by the SchemaTool will fail 
> ([https://github.com/apache/hive/blob/2dbfbeefc1a73d6a50f1c829658846fc827fc780/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/tools/schematool/SchemaToolTaskValidate.java#L117])
>  if we don't use quoted identifiers.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HIVE-27755) Quote identifiers in SQL emitted by SchemaTool for MySQL

2023-10-08 Thread Zhihua Deng (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-27755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17773136#comment-17773136
 ] 

Zhihua Deng commented on HIVE-27755:


In the TestSchemaToolForMetastore-validateSequences-after.txt,  are these from 
the validate command?
{noformat}
2023-09-28T11:28:55.582079Z 9 Query delete from SEQUENCE_TABLE
2023-09-28T11:28:55.584104Z 9 Query insert into SEQUENCE_TABLE 
values('org.apache.hadoop.hive.metastore.model.MDatabase', 100){noformat}

> Quote identifiers in SQL emitted by SchemaTool for MySQL
> 
>
> Key: HIVE-27755
> URL: https://issues.apache.org/jira/browse/HIVE-27755
> Project: Hive
>  Issue Type: Improvement
>  Components: Standalone Metastore
>Affects Versions: 4.0.0-beta-1
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
> Attachments: TestMysql-upgrade-after.txt, 
> TestMysql-upgrade-before.txt, 
> TestSchemaToolForMetastore-validateSequences-after.txt, 
> TestSchemaToolForMetastore-validateSequences-before.txt, 
> TestSchemaToolForMetastore-validateTables-after.txt, 
> TestSchemaToolForMetastore-validateTables-before.txt
>
>
> Various SchemaTool options/tasks (e.g., "validate") generate and run SQL 
> statements on the underlying database. Depending on the database identifiers 
> in the SQL statements may be quoted (see 
> [https://github.com/apache/hive/blob/2dbfbeefc1a73d6a50f1c829658846fc827fc780/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/tools/schematool/HiveSchemaHelper.java#L173]).
> Currently, all identifiers are quoted when the database is Postgres and this 
> tickets aims to do the same for MySQL/MariaDB.
> The main motivation behind this change is to avoid unexpected surprises and 
> query failures when/if the database decides to turn some of the 
> tables/columns we are using internally to reserved keywords.
> As a concrete example, the Percona fork of MySQL recently turned 
> SEQUENCE_TABLE into a reserved keyword 
> ([https://docs.percona.com/percona-server/8.0/flexibility/sequence_table.html])
>  and this comes in conflict with our internal metastore table.
> The installation scripts do not fail since in that case SEQUENCE_TABLE is 
> quoted 
> ([https://github.com/apache/hive/blob/2dbfbeefc1a73d6a50f1c829658846fc827fc780/standalone-metastore/metastore-server/src/main/sql/mysql/hive-schema-4.0.0-beta-2.mysql.sql#L447])
>  but validation queries emitted by the SchemaTool will fail 
> ([https://github.com/apache/hive/blob/2dbfbeefc1a73d6a50f1c829658846fc827fc780/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/tools/schematool/SchemaToolTaskValidate.java#L117])
>  if we don't use quoted identifiers.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HIVE-27755) Quote identifiers in SQL emitted by SchemaTool for MySQL

2023-09-28 Thread Stamatis Zampetakis (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-27755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17770042#comment-17770042
 ] 

Stamatis Zampetakis commented on HIVE-27755:


For testing the changes, I enabled the general_log for MySQL 
(https://dev.mysql.com/doc/refman/8.0/en/server-system-variables.html#sysvar_general_log)
 and run the following tests before and after the changes in PR#4757:

{noformat}
cd standalone-metastore/metastore-server
mvn test -Dtest=TestMysql#upgrade -Dtest.groups=""
mvn test -Dtest=TestSchemaToolForMetastore#testValidateSchemaTables*Mysql* 
-Dtest.groups="" (requires patch in #4754)
mvn test -Dtest=TestSchemaToolForMetastore#testValidateSequences*Mysql* 
-Dtest.groups="" (requires patch in #4754)
{noformat}

I monitored the general_log output generated by the aforementioned tests and I 
compared before and after files for each test verifying that table and column 
names are quoted as expected.

The before and after files from the general_log are attached in this JIRA.

> Quote identifiers in SQL emitted by SchemaTool for MySQL
> 
>
> Key: HIVE-27755
> URL: https://issues.apache.org/jira/browse/HIVE-27755
> Project: Hive
>  Issue Type: Improvement
>  Components: Standalone Metastore
>Affects Versions: 4.0.0-beta-1
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
> Attachments: TestMysql-upgrade-after.txt, 
> TestMysql-upgrade-before.txt, 
> TestSchemaToolForMetastore-validateSequences-after.txt, 
> TestSchemaToolForMetastore-validateSequences-before.txt, 
> TestSchemaToolForMetastore-validateTables-after.txt, 
> TestSchemaToolForMetastore-validateTables-before.txt
>
>
> Various SchemaTool options/tasks (e.g., "validate") generate and run SQL 
> statements on the underlying database. Depending on the database identifiers 
> in the SQL statements may be quoted (see 
> [https://github.com/apache/hive/blob/2dbfbeefc1a73d6a50f1c829658846fc827fc780/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/tools/schematool/HiveSchemaHelper.java#L173]).
> Currently, all identifiers are quoted when the database is Postgres and this 
> tickets aims to do the same for MySQL/MariaDB.
> The main motivation behind this change is to avoid unexpected surprises and 
> query failures when/if the database decides to turn some of the 
> tables/columns we are using internally to reserved keywords.
> As a concrete example, the Percona fork of MySQL recently turned 
> SEQUENCE_TABLE into a reserved keyword 
> ([https://docs.percona.com/percona-server/8.0/flexibility/sequence_table.html])
>  and this comes in conflict with our internal metastore table.
> The installation scripts do not fail since in that case SEQUENCE_TABLE is 
> quoted 
> ([https://github.com/apache/hive/blob/2dbfbeefc1a73d6a50f1c829658846fc827fc780/standalone-metastore/metastore-server/src/main/sql/mysql/hive-schema-4.0.0-beta-2.mysql.sql#L447])
>  but validation queries emitted by the SchemaTool will fail 
> ([https://github.com/apache/hive/blob/2dbfbeefc1a73d6a50f1c829658846fc827fc780/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/tools/schematool/SchemaToolTaskValidate.java#L117])
>  if we don't use quoted identifiers.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)