
[jira] [Resolved] (HIVE-28316) The documentation provides an ambiguous explanation regarding the mutually exclusive nature of `STORED BY` and `STORED AS`

2024-07-10 Thread Zhihua Deng (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-28316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng resolved HIVE-28316.

Fix Version/s: Not Applicable
   Resolution: Fixed

Updated the document. Thank you for the issue report, [~linghengqian]!

> The documentation provides an ambiguous explanation regarding the mutually 
> exclusive nature of `STORED BY` and `STORED AS`
> --
>
> Key: HIVE-28316
> URL: https://issues.apache.org/jira/browse/HIVE-28316
> Project: Hive
>  Issue Type: Bug
>  Components: Documentation
>Reporter: Qiheng He
>Assignee: Zhihua Deng
>Priority: Major
> Fix For: Not Applicable
>
>
> - The documentation provides an ambiguous explanation regarding the mutually 
> exclusive nature of {*}STORED BY{*} and {*}STORED AS{*}.
> - As mentioned on 
> https://cwiki.apache.org/confluence/display/Hive/StorageHandlers , when the 
> {*}CREATE TABLE{*} statement specifies {*}STORED BY{*}, it should not also 
> specify {*}STORED AS{*}. The content in question is as follows.
> {code:bash}
> When STORED BY is specified, then row_format (DELIMITED or SERDE) and STORED 
> AS cannot be specified. Optional SERDEPROPERTIES can be specified as part of 
> the STORED BY clause and will be passed to the serde provided by the storage 
> handler.
> See CREATE TABLE and Row Format, Storage Format, and SerDe for more 
> information.
> Example:
> CREATE TABLE hbase_table_1(key int, value string)
> STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
> WITH SERDEPROPERTIES (
> "hbase.columns.mapping" = "cf:string",
> "hbase.table.name" = "hbase_table_0"
> );
> {code}
> - This is similarly reflected in the documentation at 
> https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL , where 
> {*}|{*} separates {*}STORED BY{*} from {*}STORED AS{*}, indicating their 
> distinct usage and mutual exclusivity.
> {code:bash}
> [
>[ROW FORMAT row_format] 
>[STORED AS file_format]
>  | STORED BY 'storage.handler.class.name' [WITH SERDEPROPERTIES (...)]  
> -- (Note: Available in Hive 0.6.0 and later)
> ]
> {code}
> - However, this contradicts the information provided in the Hive-Iceberg 
> Integration documentation at 
> https://cwiki.apache.org/confluence/display/Hive/Hive-Iceberg+Integration , 
> which explicitly gives examples demonstrating that {*}STORED BY{*} can 
> coexist with {*}STORED AS{*}. This creates an ambiguous interpretation. 
> {code:bash}
> The iceberg table currently supports three file formats: PARQUET, ORC & AVRO. 
> The default file format is Parquet. The file format can be explicitly 
> provided by using STORED AS while creating the table
> Example-1:
> CREATE TABLE ORC_TABLE (ID INT) STORED BY ICEBERG STORED AS ORC;
> {code}
> - Further early discussions on this topic can be found at 
> https://github.com/apache/shardingsphere/pull/31526 .



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

[jira] [Commented] (HIVE-28338) Client connection count is not correct in HiveMetaStore#close

2024-06-28 Thread Zhihua Deng (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-28338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17860919#comment-17860919
 ] 

Zhihua Deng commented on HIVE-28338:


Fix has been merged. Thank you for the contribution, [~wechar]!

> Client connection count is not correct in HiveMetaStore#close
> -
>
> Key: HIVE-28338
> URL: https://issues.apache.org/jira/browse/HIVE-28338
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Wechar
>Assignee: Wechar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.1.0
>
>
> HIVE-24349 introduced a bug in {{HiveMetaStoreClient}} for embedded 
> metastore, where the log would print negative connection counts.
> *Root Cause*
> The connection count is only used with a remote metastore, so we do not need 
> to decrease it when the transport is null.
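
The root cause above can be illustrated with a minimal sketch. The class and field names below (CountingClient, OPEN_CONNECTIONS, transport) are hypothetical stand-ins, not Hive's actual code; the sketch only assumes that the counter is incremented when a remote transport is opened, so close() should decrement it only in that case.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a metastore client; not Hive's actual class. */
final class CountingClient implements AutoCloseable {
    // Shared count of open remote connections (assumed name).
    static final AtomicInteger OPEN_CONNECTIONS = new AtomicInteger();

    // Non-null only when the client talks to a remote metastore.
    private Object transport;

    CountingClient(boolean remote) {
        if (remote) {
            transport = new Object();
            OPEN_CONNECTIONS.incrementAndGet();
        }
    }

    @Override
    public void close() {
        // An embedded client never incremented the counter, so it must not
        // decrement it; the null check also makes a second close() a no-op.
        if (transport != null) {
            transport = null;
            OPEN_CONNECTIONS.decrementAndGet();
        }
    }
}
```

With this guard, closing an embedded client leaves the counter untouched, so it cannot go negative.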




[jira] [Resolved] (HIVE-28338) Client connection count is not correct in HiveMetaStore#close

2024-06-28 Thread Zhihua Deng (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-28338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng resolved HIVE-28338.

Fix Version/s: 4.1.0
   Resolution: Fixed

> Client connection count is not correct in HiveMetaStore#close
> -
>
> Key: HIVE-28338
> URL: https://issues.apache.org/jira/browse/HIVE-28338
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Wechar
>Assignee: Wechar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.1.0
>
>
> HIVE-24349 introduced a bug in {{HiveMetaStoreClient}} for embedded 
> metastore, where the log would print negative connection counts.
> *Root Cause*
> The connection count is only used with a remote metastore, so we do not need 
> to decrease it when the transport is null.




[jira] [Commented] (HIVE-28352) Schematool fails to upgradeSchema on dbType=hive

2024-06-27 Thread Zhihua Deng (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-28352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17860366#comment-17860366
 ] 

Zhihua Deng commented on HIVE-28352:


Thank you, [~okumin], for the issue and PR. I have marked this Jira with the hive-4.0.1-must label.

> Schematool fails to upgradeSchema on dbType=hive
> 
>
> Key: HIVE-28352
> URL: https://issues.apache.org/jira/browse/HIVE-28352
> Project: Hive
>  Issue Type: Improvement
>  Components: Standalone Metastore
>Affects Versions: 4.0.0
>Reporter: Shohei Okumiya
>Assignee: Shohei Okumiya
>Priority: Major
>  Labels: hive-4.0.1-must
>
> Schematool tries to refer to incorrect file names.
> {code:java}
> $ schematool -metaDbType derby -dbType hive -initSchemaTo 3.0.0 -url 
> jdbc:hive2://hive-hiveserver2:1/default -driver 
> org.apache.hive.jdbc.HiveDriver
> $ schematool -metaDbType derby -dbType hive -upgradeSchema -url 
> jdbc:hive2://hive-hiveserver2:1/default -driver 
> org.apache.hive.jdbc.HiveDriver
> ...
> Completed upgrade-3.0.0-to-3.1.0.hive.sql
> Upgrade script upgrade-3.1.0-to-4.0.0-alpha-1.hive.hive.sql
> 2024-06-27T01:41:46,572 ERROR [main] schematool.MetastoreSchemaTool: Upgrade 
> FAILED! Metastore state would be inconsistent !!
> Upgrade FAILED! Metastore state would be inconsistent !!
> 2024-06-27T01:41:46,572 ERROR [main] schematool.MetastoreSchemaTool: 
> Underlying cause: java.io.FileNotFoundException : 
> /opt/hive/scripts/metastore/upgrade/hive/upgrade-3.1.0-to-4.0.0-alpha-1.hive.hive.sql
>  (No such file or directory)
> Underlying cause: java.io.FileNotFoundException : 
> /opt/hive/scripts/metastore/upgrade/hive/upgrade-3.1.0-to-4.0.0-alpha-1.hive.hive.sql
>  (No such file or directory)
> 2024-06-27T01:41:46,572 ERROR [main] schematool.MetastoreSchemaTool: Use 
> --verbose for detailed stacktrace.
> Use --verbose for detailed stacktrace.
> 2024-06-27T01:41:46,573 ERROR [main] schematool.MetastoreSchemaTool: *** 
> schemaTool failed ***
> *** schemaTool failed *** {code}
>  
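
The log above shows the ".hive" suffix appearing twice in the script name. One way such a name can arise is when the database-type suffix is appended to a string that already carries it. The sketch below is only an illustration of that pattern and of an idempotent alternative; it is not the root cause identified in the issue, and UpgradeScriptName and its parameters are hypothetical names.

```java
/** Hypothetical helper; schematool's real naming logic may differ. */
final class UpgradeScriptName {
    /**
     * Builds "upgrade-<fromTo>.<dbType>.sql", appending the dbType suffix
     * only if fromTo does not already end with it.
     */
    static String of(String fromTo, String dbType) {
        String suffix = "." + dbType;
        String base = fromTo.endsWith(suffix) ? fromTo : fromTo + suffix;
        return "upgrade-" + base + ".sql";
    }
}
```

Called with either "3.1.0-to-4.0.0-alpha-1" or "3.1.0-to-4.0.0-alpha-1.hive" and dbType "hive", the helper yields the same single-suffix file name.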




[jira] [Updated] (HIVE-28352) Schematool fails to upgradeSchema on dbType=hive

2024-06-27 Thread Zhihua Deng (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-28352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng updated HIVE-28352:
---
Labels: hive-4.0.1-must  (was: )

> Schematool fails to upgradeSchema on dbType=hive
> 
>
> Key: HIVE-28352
> URL: https://issues.apache.org/jira/browse/HIVE-28352
> Project: Hive
>  Issue Type: Improvement
>  Components: Standalone Metastore
>Affects Versions: 4.0.0
>Reporter: Shohei Okumiya
>Assignee: Shohei Okumiya
>Priority: Major
>  Labels: hive-4.0.1-must
>




[jira] [Resolved] (HIVE-28205) Implement direct sql for get_partitions_ps_with_auth api

2024-06-18 Thread Zhihua Deng (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-28205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng resolved HIVE-28205.

Fix Version/s: 4.1.0
   Resolution: Fixed

Fix has been merged. Thank you [~wechar] for the PR, and [~zhangbutao] for the 
review!

> Implement direct sql for get_partitions_ps_with_auth api
> 
>
> Key: HIVE-28205
> URL: https://issues.apache.org/jira/browse/HIVE-28205
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Wechar
>Assignee: Wechar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.1.0
>
>





[jira] [Commented] (HIVE-28317) Support HiveServer2 JDBC Driver's `initFile` parameter to directly read SQL files on the classpath

2024-06-15 Thread Zhihua Deng (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-28317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17855271#comment-17855271
 ] 

Zhihua Deng commented on HIVE-28317:


Do other databases support this use case? Does {*}classpath{*} refer to the 
current directory where the process is started?

> Support HiveServer2 JDBC Driver's `initFile` parameter to directly read SQL 
> files on the classpath
> --
>
> Key: HIVE-28317
> URL: https://issues.apache.org/jira/browse/HIVE-28317
> Project: Hive
>  Issue Type: Wish
>Reporter: Qiheng He
>Priority: Major
>
> - Support HiveServer2 JDBC Driver's {*}initFile{*} parameter to directly read 
> SQL files on the classpath.
> - In https://issues.apache.org/jira/browse/HIVE-5867 , the {*}initFile{*} 
> parameter of HiveServer2 JDBC Driver is supported to read SQL files on 
> absolute paths. However, this jdbcUrl parameter does not natively support 
> paths on the {*}classpath{*}. This necessitates executing additional steps if 
> one needs to read a file located on the {*}classpath{*}.
> {code:java}
> import com.zaxxer.hikari.HikariConfig;
> import com.zaxxer.hikari.HikariDataSource;
> import org.awaitility.Awaitility;
> import org.junit.jupiter.api.Test;
> import java.nio.file.Paths;
> import java.time.Duration;
> public class ExampleTest {
> @Test
> void test() {
> String absolutePath = 
> Paths.get("src/test/resources/test-sql/test-databases-hive.sql")
> .toAbsolutePath().normalize().toString();
> HikariConfig config = new HikariConfig();
> config.setDriverClassName("org.apache.hive.jdbc.HiveDriver");
> config.setJdbcUrl("jdbc:hive2://localhost:1;initFile=" + 
> absolutePath);
> try (HikariDataSource hikariDataSource = new 
> HikariDataSource(config)) {
> 
> Awaitility.await().atMost(Duration.ofMinutes(1L)).ignoreExceptions().until(() 
> -> {
> hikariDataSource.getConnection().close();
> return true;
> });
> }
> }
> }
> {code}
> - It would greatly facilitate integration testing scenarios, such as those 
> with {*}testcontainers-java{*}, if the {*}initFile{*} parameter could 
> recognize classpath files by identifying a {*}classpath{*} prefix. For 
> instance, a JDBC URL like 
> {*}jdbc:hive2://localhost:1;initFile=classpath:test-sql/test-databases-hive.sql{*}
>  would become much more useful.
> - Further early discussions on this topic can be found at 
> https://github.com/apache/shardingsphere/pull/31526 .
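
Until the driver understands such a prefix, the "additional steps" mentioned above can be done on the client side. The following is a minimal sketch under stated assumptions: the "classpath:" prefix convention, the InitFileResolver class, and its method names are all hypothetical, and the driver is only assumed to accept an absolute file path in initFile, as in the test above. The resource is copied to a temporary file because a classpath entry inside a jar has no file-system path.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical client-side helper; not part of the Hive JDBC driver. */
final class InitFileResolver {
    static final String PREFIX = "classpath:";

    /** Returns a path usable as initFile; plain paths pass through unchanged. */
    static String resolve(String initFile) {
        if (!initFile.startsWith(PREFIX)) {
            return initFile;
        }
        String resource = initFile.substring(PREFIX.length());
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClassLoader.getSystemClassLoader();
        }
        try (InputStream in = loader.getResourceAsStream(resource)) {
            if (in == null) {
                throw new IllegalArgumentException("Resource not found on classpath: " + resource);
            }
            // Copy the resource to a temporary file the driver can open by path.
            Path copy = Files.createTempFile("init-", ".sql");
            copy.toFile().deleteOnExit();
            Files.copy(in, copy, StandardCopyOption.REPLACE_EXISTING);
            return copy.toAbsolutePath().toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Builds a URL in the same shape as the test above: host and port, then initFile. */
    static String jdbcUrl(String hostAndPort, String initFile) {
        return "jdbc:hive2://" + hostAndPort + ";initFile=" + resolve(initFile);
    }
}
```

A test could then pass "classpath:test-sql/test-databases-hive.sql" to jdbcUrl instead of computing the absolute path by hand.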




[jira] [Commented] (HIVE-28316) The documentation provides an ambiguous explanation regarding the mutually exclusive nature of `STORED BY` and `STORED AS`

2024-06-15 Thread Zhihua Deng (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-28316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17855270#comment-17855270
 ] 

Zhihua Deng commented on HIVE-28316:


The document is somewhat outdated; let me fix it. Thank you for pointing it 
out, [~linghengqian]!

> The documentation provides an ambiguous explanation regarding the mutually 
> exclusive nature of `STORED BY` and `STORED AS`
> --
>
> Key: HIVE-28316
> URL: https://issues.apache.org/jira/browse/HIVE-28316
> Project: Hive
>  Issue Type: Bug
>Reporter: Qiheng He
>Priority: Major
>




[jira] [Assigned] (HIVE-28316) The documentation provides an ambiguous explanation regarding the mutually exclusive nature of `STORED BY` and `STORED AS`

2024-06-15 Thread Zhihua Deng (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-28316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng reassigned HIVE-28316:
--

Assignee: Zhihua Deng

> The documentation provides an ambiguous explanation regarding the mutually 
> exclusive nature of `STORED BY` and `STORED AS`
> --
>
> Key: HIVE-28316
> URL: https://issues.apache.org/jira/browse/HIVE-28316
> Project: Hive
>  Issue Type: Bug
>  Components: Documentation
>Reporter: Qiheng He
>Assignee: Zhihua Deng
>Priority: Major
>




[jira] [Updated] (HIVE-28316) The documentation provides an ambiguous explanation regarding the mutually exclusive nature of `STORED BY` and `STORED AS`

2024-06-15 Thread Zhihua Deng (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-28316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng updated HIVE-28316:
---
Component/s: Documentation

> The documentation provides an ambiguous explanation regarding the mutually 
> exclusive nature of `STORED BY` and `STORED AS`
> --
>
> Key: HIVE-28316
> URL: https://issues.apache.org/jira/browse/HIVE-28316
> Project: Hive
>  Issue Type: Bug
>  Components: Documentation
>Reporter: Qiheng He
>Priority: Major
>




[jira] [Commented] (HIVE-28308) The module `org.apache.hive:hive-jdbc:4.0.0` unintuitively removed all dependencies under the Compile Scope

2024-06-15 Thread Zhihua Deng (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-28308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17855269#comment-17855269
 ] 

Zhihua Deng commented on HIVE-28308:


[~linghengqian] Thank you for the input.

Regarding "Furthermore, the Uber JAR can complicate building GraalVM Native 
Images, necessitating additional GraalVM Reachability Metadata due to its 
inclusive nature."

Could you please give more details on why the uber jar could cause trouble 
for the build? Is it because of its large size? Apart from the class conflicts, 
is there any other reason why org.apache.hive:hive-jdbc:4.0.0 is preferable 
to the standalone jar?

Thanks,

Zhihua

> The module `org.apache.hive:hive-jdbc:4.0.0` unintuitively removed all 
> dependencies under the Compile Scope
> ---
>
> Key: HIVE-28308
> URL: https://issues.apache.org/jira/browse/HIVE-28308
> Project: Hive
>  Issue Type: Bug
>Reporter: Qiheng He
>Priority: Major
>
> - The module *org.apache.hive:hive-jdbc:4.0.0* unintuitively removed all 
> dependencies under the Compile Scope.
>  - Comparing the dependencies listed at 
> [https://central.sonatype.com/artifact/org.apache.hive/hive-jdbc/4.0.0/dependencies]
>  with those at 
> [https://central.sonatype.com/artifact/org.apache.hive/hive-jdbc/3.1.3/dependencies]
>  , it becomes apparent that *org.apache.hive:hive-jdbc:4.0.0* includes only 
> test dependencies. This results in the need to manually add additional 
> dependencies when utilizing the HiveServer2 JDBC Driver.
>  - 
> {code:xml}
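> <dependency>
>   <groupId>org.apache.hive</groupId>
>   <artifactId>hive-jdbc</artifactId>
>   <version>4.0.0</version>
> </dependency>
> <dependency>
>   <groupId>org.apache.hive</groupId>
>   <artifactId>hive-service</artifactId>
>   <version>4.0.0</version>
>   <exclusions>
>     <exclusion>
>       <groupId>org.apache.logging.log4j</groupId>
>       <artifactId>log4j-slf4j-impl</artifactId>
>     </exclusion>
>     <exclusion>
>       <groupId>org.slf4j</groupId>
>       <artifactId>slf4j-reload4j</artifactId>
>     </exclusion>
>     <exclusion>
>       <groupId>org.apache.logging.log4j</groupId>
>       <artifactId>log4j-web</artifactId>
>     </exclusion>
>     <exclusion>
>       <groupId>org.apache.logging.log4j</groupId>
>       <artifactId>log4j-core</artifactId>
>     </exclusion>
>     <exclusion>
>       <groupId>org.apache.logging.log4j</groupId>
>       <artifactId>log4j-api</artifactId>
>     </exclusion>
>     <exclusion>
>       <groupId>org.apache.logging.log4j</groupId>
>       <artifactId>log4j-1.2-api</artifactId>
>     </exclusion>
>   </exclusions>
> </dependency>
> <dependency>
>   <groupId>org.apache.hadoop</groupId>
>   <artifactId>hadoop-client-runtime</artifactId>
>   <version>3.3.6</version>
> </dependency>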
> {code}
>  - Earlier investigation can be found at 
> https://github.com/apache/shardingSphere/pull/31526 and 
> https://github.com/dbeaver/dbeaver/issues/22777 . I personally do not think 
> this behavior is reasonable.




[jira] [Updated] (HIVE-28278) Iceberg: Stats: IllegalStateException Invalid file: file length 0

2024-06-15 Thread Zhihua Deng (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-28278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng updated HIVE-28278:
---
Labels: hive-4.0.1-merged hive-4.0.1-must pull-request-available  (was: 
hive-4.0.1-must pull-request-available)

> Iceberg: Stats: IllegalStateException Invalid file: file length 0
> -
>
> Key: HIVE-28278
> URL: https://issues.apache.org/jira/browse/HIVE-28278
> Project: Hive
>  Issue Type: Bug
>  Components: Iceberg integration
>Affects Versions: 4.0.0
>Reporter: Denys Kuzmenko
>Assignee: Denys Kuzmenko
>Priority: Major
>  Labels: hive-4.0.1-merged, hive-4.0.1-must, 
> pull-request-available
> Fix For: 4.1.0
>
>
> Bug fix: this can happen when the stats file has already been created but the 
> stats object has not yet been written, and someone tries to read it.
> Why are the changes needed?
> {code}
> ERROR : FAILED: IllegalStateException Invalid file: file length 0 is less tha 
> minimal length of the footer tail 12
> java.lang.IllegalStateException: Invalid file: file length 0 is less tha 
> minimal length of the footer tail 12
> {code}
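
The race described above (a stats file that exists but is still empty) can be guarded against on the reading side. The sketch below is not the fix merged for this issue; StatsFileGuard is a hypothetical name, and the threshold of 12 bytes is taken only from the error message above.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical reader-side guard; not the actual Hive or Iceberg code. */
final class StatsFileGuard {
    // Minimal footer tail length, as reported in the error message above.
    static final long MIN_FOOTER_TAIL = 12;

    /** Returns true only if the file exists and is long enough to hold a footer tail. */
    static boolean isReadable(Path statsFile) {
        try {
            return Files.isRegularFile(statsFile)
                    && Files.size(statsFile) >= MIN_FOOTER_TAIL;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A caller could treat a false result as "stats not available yet" instead of failing the query.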




[jira] [Assigned] (HIVE-28315) Missing classes while using hive jdbc standalone jar

2024-06-15 Thread Zhihua Deng (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-28315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng reassigned HIVE-28315:
--

Assignee: Zhihua Deng

> Missing classes while using hive jdbc standalone jar
> 
>
> Key: HIVE-28315
> URL: https://issues.apache.org/jira/browse/HIVE-28315
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zhihua Deng
>Assignee: Zhihua Deng
>Priority: Major
>  Labels: hive-4.0.1-must
>





[jira] [Updated] (HIVE-28270) Fix missing partition paths bug on drop_database

2024-06-15 Thread Zhihua Deng (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-28270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng updated HIVE-28270:
---
Labels: hive-4.0.1-merged hive-4.0.1-must pull-request-available  (was: 
hive-4.0.1-must pull-request-available)

> Fix missing partition paths bug on drop_database
> -
>
> Key: HIVE-28270
> URL: https://issues.apache.org/jira/browse/HIVE-28270
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Wechar
>Assignee: Wechar
>Priority: Major
>  Labels: hive-4.0.1-merged, hive-4.0.1-must, 
> pull-request-available
> Fix For: 4.1.0
>
>
> {{HMSHandler#drop_database_core}} needs to collect all partition paths that 
> are not in a subdirectory of the table path, but currently it only fetches 
> the last batch of paths.
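
The "last batch only" symptom is typical of a loop that overwrites its result on every batch instead of accumulating. The following is a minimal sketch of the accumulating version, assuming paths are plain strings and batches are slices of a list; PartitionPathCollector and its parameters are hypothetical and do not mirror the HMSHandler code.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of batch accumulation; not HMSHandler code. */
final class PartitionPathCollector {
    /** Collects every partition path that is not under tablePath, across all batches. */
    static List<String> collect(List<String> partitionPaths, String tablePath, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        String prefix = tablePath.endsWith("/") ? tablePath : tablePath + "/";
        // Declared outside the loop so results from earlier batches are kept.
        List<String> outside = new ArrayList<>();
        for (int start = 0; start < partitionPaths.size(); start += batchSize) {
            int end = Math.min(start + batchSize, partitionPaths.size());
            for (String path : partitionPaths.subList(start, end)) {
                if (!path.startsWith(prefix)) {
                    outside.add(path);
                }
            }
        }
        return outside;
    }
}
```

Reassigning the result list inside the loop would reproduce the reported behavior, where only paths from the final batch survive.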




[jira] [Updated] (HIVE-28271) DirectSql fails for AlterPartitions

2024-06-15 Thread Zhihua Deng (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-28271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng updated HIVE-28271:
---
Labels: hive-4.0.1-merged hive-4.0.1-must pull-request-available  (was: 
hive-4.0.1-must pull-request-available)

> DirectSql fails for AlterPartitions
> ---
>
> Key: HIVE-28271
> URL: https://issues.apache.org/jira/browse/HIVE-28271
> Project: Hive
>  Issue Type: Bug
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
>  Labels: hive-4.0.1-merged, hive-4.0.1-must, 
> pull-request-available
> Fix For: 4.1.0
>
>
> It fails in three places (it does not handle databases that use CLOB, and it 
> is missing Boolean type conversion checks):
> *First:*
> {noformat}
> 2024-05-21T08:50:16,570  WARN [main] metastore.ObjectStore: Falling back to 
> ORM path due to direct SQL failure (this is not an error): 
> java.lang.ClassCastException: org.apache.derby.impl.jdbc.EmbedClob cannot be 
> cast to java.lang.String at 
> org.apache.hadoop.hive.metastore.ExceptionHandler.newMetaException(ExceptionHandler.java:152)
>  at org.apache.hadoop.hive.metastore.Batchable.runBatched(Batchable.java:92) 
> at 
> org.apache.hadoop.hive.metastore.DirectSqlUpdatePart.getParams(DirectSqlUpdatePart.java:748)
>  at 
> org.apache.hadoop.hive.metastore.DirectSqlUpdatePart.updateParamTableInBatch(DirectSqlUpdatePart.java:715)
>  at 
> org.apache.hadoop.hive.metastore.DirectSqlUpdatePart.alterPartitions(DirectSqlUpdatePart.java:636)
>  at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.alterPartitions(MetaStoreDirectSql.java:599)
>  at 
> org.apache.hadoop.hive.metastore.ObjectStore$20.getSqlResult(ObjectStore.java:5371);
> {noformat}
> *Second:*
> {noformat}
> 2024-05-21T09:14:36,808  WARN [main] metastore.ObjectStore: Falling back to 
> ORM path due to direct SQL failure (this is not an error): 
> java.lang.ClassCastException: org.apache.derby.impl.jdbc.EmbedClob cannot be 
> cast to java.lang.String at 
> org.apache.hadoop.hive.metastore.ExceptionHandler.newMetaException(ExceptionHandler.java:152)
>  at org.apache.hadoop.hive.metastore.Batchable.runBatched(Batchable.java:92) 
> at 
> org.apache.hadoop.hive.metastore.DirectSqlUpdatePart.updateCDInBatch(DirectSqlUpdatePart.java:1228)
>  at 
> org.apache.hadoop.hive.metastore.DirectSqlUpdatePart.updateStorageDescriptorInBatch(DirectSqlUpdatePart.java:888)
>  at 
> org.apache.hadoop.hive.metastore.DirectSqlUpdatePart.alterPartitions(DirectSqlUpdatePart.java:638)
>  at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.alterPartitions(MetaStoreDirectSql.java:599)
>  at 
> org.apache.hadoop.hive.metastore.ObjectStore$20.getSqlResult(ObjectStore.java:5371);{noformat}
> *Third: Missing Boolean type check*
> {noformat}
> 2024-05-21T09:35:44,063  WARN [main] metastore.ObjectStore: Falling back to 
> ORM path due to direct SQL failure (this is not an error): 
> java.sql.BatchUpdateException: A truncation error was encountered trying to 
> shrink CHAR 'false' to length 1. at 
> org.apache.hadoop.hive.metastore.ExceptionHandler.newMetaException(ExceptionHandler.java:152)
>  at org.apache.hadoop.hive.metastore.Batchable.runBatched(Batchable.java:92) 
> at 
> org.apache.hadoop.hive.metastore.DirectSqlUpdatePart.lambda$updateSDInBatch$16(DirectSqlUpdatePart.java:926)
>  at 
> org.apache.hadoop.hive.metastore.DirectSqlUpdatePart.updateWithStatement(DirectSqlUpdatePart.java:656)
>  at 
> org.apache.hadoop.hive.metastore.DirectSqlUpdatePart.updateSDInBatch(DirectSqlUpdatePart.java:926)
>  at 
> org.apache.hadoop.hive.metastore.DirectSqlUpdatePart.updateStorageDescriptorInBatch(DirectSqlUpdatePart.java:900)
>  at 
> org.apache.hadoop.hive.metastore.DirectSqlUpdatePart.alterPartitions(DirectSqlUpdatePart.java:638)
>  at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.alterPartitions(MetaStoreDirectSql.java:599)
>  at 
> org.apache.hadoop.hive.metastore.ObjectStore$20.getSqlResult(ObjectStore.java:5371);
> {noformat}
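
Both kinds of failure in the traces above come from value conversion: a column returned as a CLOB is cast directly to String, and a boolean is written as the text 'false' into a CHAR column of length 1. The sketch below shows defensive conversions using only standard java.sql types. DirectSqlValues is a hypothetical name, and the 'Y'/'N' encoding is an assumption about the single-character format, not something stated in the issue.

```java
import java.sql.Clob;
import java.sql.SQLException;

/** Hypothetical conversion helpers; not the DirectSqlUpdatePart code. */
final class DirectSqlValues {
    /** Converts a JDBC result value to String, whether it arrives as String or Clob. */
    static String asString(Object value) {
        if (value == null) {
            return null;
        }
        if (value instanceof Clob) {
            Clob clob = (Clob) value;
            try {
                long length = clob.length();
                if (length == 0) {
                    return "";
                }
                // Clob positions are 1-based; values are assumed to fit in an int length.
                return clob.getSubString(1, (int) length);
            } catch (SQLException e) {
                throw new IllegalStateException("Failed to read CLOB value", e);
            }
        }
        return value.toString();
    }

    /** Encodes a boolean as a single character so it fits a CHAR(1) column (assumed 'Y'/'N'). */
    static String asFlag(boolean value) {
        return value ? "Y" : "N";
    }
}
```

Using asString in place of a direct cast avoids the ClassCastException on databases that return CLOBs, and asFlag avoids binding a five-character string to a one-character column.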




[jira] [Updated] (HIVE-28260) CreateTableEvent wrongly skips authorizing DFS_URI for managed table

2024-06-15 Thread Zhihua Deng (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-28260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng updated HIVE-28260:
---
Labels: hive-4.0.1-merged hive-4.0.1-must pull-request-available  (was: 
hive-4.0.1-must pull-request-available)

> CreateTableEvent wrongly skips authorizing DFS_URI for managed table 
> -
>
> Key: HIVE-28260
> URL: https://issues.apache.org/jira/browse/HIVE-28260
> Project: Hive
>  Issue Type: Bug
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
>  Labels: hive-4.0.1-merged, hive-4.0.1-must, 
> pull-request-available
> Fix For: 4.1.0
>
>
> HIVE-27525 relaxed the permission checks for external tables, but because of 
> an incorrect managed-table check it also relaxed them for managed tables.




[jira] [Updated] (HIVE-28211) Restore hive-exec-core jar

2024-06-15 Thread Zhihua Deng (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-28211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng updated HIVE-28211:
---
Labels: hive-4.0.1-merged hive-4.0.1-must pull-request-available  (was: 
hive-4.0.1-must pull-request-available)

> Restore hive-exec-core jar
> --
>
> Key: HIVE-28211
> URL: https://issues.apache.org/jira/browse/HIVE-28211
> Project: Hive
>  Issue Type: Task
>Reporter: Simhadri Govindappa
>Assignee: Simhadri Govindappa
>Priority: Major
>  Labels: hive-4.0.1-merged, hive-4.0.1-must, 
> pull-request-available
> Fix For: 4.1.0
>
>
> The hive-exec-core jar is used by Spark, Oozie, Hudi and many other projects. 
> Removal of the hive-exec-core jar has caused the following issues.
> Spark : [https://lists.apache.org/list?d...@hive.apache.org:lte=1M:joda]
> Oozie: [https://lists.apache.org/thread/yld75ltf9y8d9q3cow3xqlg0fqyj6mkg]
> Hudi: [apache/hudi#8147|https://github.com/apache/hudi/issues/8147]
> Until we shade & relocate dependencies in hive-exec, we should restore 
> the hive-exec-core jar.




[jira] [Updated] (HIVE-28254) CBO (Calcite Return Path): Multiple DISTINCT leads to wrong results

2024-06-15 Thread Zhihua Deng (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-28254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng updated HIVE-28254:
---
Labels: hive-4.0.1-merged hive-4.0.1-must pull-request-available  (was: 
hive-4.0.1-must pull-request-available)

> CBO (Calcite Return Path): Multiple DISTINCT leads to wrong results
> ---
>
> Key: HIVE-28254
> URL: https://issues.apache.org/jira/browse/HIVE-28254
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Affects Versions: 4.0.0
>Reporter: Shohei Okumiya
>Assignee: Shohei Okumiya
>Priority: Major
>  Labels: hive-4.0.1-merged, hive-4.0.1-must, 
> pull-request-available
> Fix For: 4.1.0
>
>
> CBO return path can build incorrect GroupByOperator when multiple 
> aggregations with DISTINCT are involved.
> This is an example.
> {code:java}
> CREATE TABLE test (col1 INT, col2 INT);
> INSERT INTO test VALUES (1, 100), (2, 200), (2, 200), (3, 300);
> set hive.cbo.returnpath.hiveop=true;
> set hive.map.aggr=false;
> SELECT
>   SUM(DISTINCT col1),
>   COUNT(DISTINCT col1),
>   SUM(DISTINCT col2),
>   SUM(col2)
> FROM test;{code}
> The last column should be 800. But the SUM refers to col1 and the actual 
> result is 8.
> {code:java}
> +--+--+--+--+
> | _c0  | _c1  | _c2  | _c3  |
> +--+--+--+--+
> | 6    | 3    | 600  | 8    |
> +--+--+--+--+ {code}
>  
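
The expected values for the example query can be checked by hand. A small Python sketch over the same four rows, showing that the wrong last column (8) is exactly what SUM(col1) would give. This only recomputes the arithmetic and says nothing about how the operator tree is built.

```python
# Rows inserted by the example: (col1, col2)
rows = [(1, 100), (2, 200), (2, 200), (3, 300)]
col1 = [r[0] for r in rows]
col2 = [r[1] for r in rows]

# SUM(DISTINCT col1), COUNT(DISTINCT col1), SUM(DISTINCT col2), SUM(col2)
expected = (sum(set(col1)), len(set(col1)), sum(set(col2)), sum(col2))

# What the last column becomes if SUM refers to col1 instead of col2
wrong_last_column = sum(col1)

print(expected)           # (6, 3, 600, 800)
print(wrong_last_column)  # 8
```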




[jira] [Updated] (HIVE-28202) Incorrect projected column size after ORC upgrade to v1.6.7

2024-06-15 Thread Zhihua Deng (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-28202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng updated HIVE-28202:
---
Labels: hive-4.0.1-merged hive-4.0.1-must performance 
pull-request-available  (was: hive-4.0.1-must performance 
pull-request-available)

> Incorrect projected column size after ORC upgrade to v1.6.7 
> 
>
> Key: HIVE-28202
> URL: https://issues.apache.org/jira/browse/HIVE-28202
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 4.0.0, 4.0.0-beta-1
>Reporter: Denys Kuzmenko
>Assignee: Denys Kuzmenko
>Priority: Critical
>  Labels: hive-4.0.1-merged, hive-4.0.1-must, performance, 
> pull-request-available
> Fix For: 4.1.0
>
>
> `ReaderImpl.getRawDataSizeFromColIndices` changed its handling of struct 
> types and now includes their subtypes. This causes an issue in Hive: the 
> root struct index is always "included", so the size is estimated for the 
> complete schema rather than just the selected columns, which leads to 
> incorrect split estimations.
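
The effect can be illustrated with a toy type tree. This is a sketch under stated assumptions: the schema, column ids and byte sizes are invented, and `raw_size` is a hypothetical stand-in, not ORC's `ReaderImpl` code.

```python
# Hypothetical type tree: id -> (child ids, raw size in bytes)
SCHEMA = {
    0: ([1, 2, 3], 0),   # root struct
    1: ([], 100),        # column a
    2: ([], 2000),       # column b
    3: ([4, 5], 0),      # struct column c
    4: ([], 30),         # c.x
    5: ([], 40),         # c.y
}


def raw_size(col_ids, include_subtypes):
    """Sum raw sizes of the given ids, optionally descending into subtypes."""
    seen = set()

    def visit(cid):
        if cid in seen:
            return 0
        seen.add(cid)
        children, size = SCHEMA[cid]
        if include_subtypes:
            size += sum(visit(child) for child in children)
        return size

    return sum(visit(cid) for cid in col_ids)


# The query selects only column 1, but the root (id 0) is always "included".
old_estimate = raw_size([0, 1], include_subtypes=False)
new_estimate = raw_size([0, 1], include_subtypes=True)
print(old_estimate, new_estimate)
```

With subtypes included, passing the root id pulls in every column, so the estimate covers the whole schema even though a single column was selected.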




[jira] [Updated] (HIVE-28190) Materialized view rebuild lock heart-beating is broken

2024-06-15 Thread Zhihua Deng (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-28190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng updated HIVE-28190:
---
Labels: hive-4.0.1-merged hive-4.0.1-must pull-request-available  (was: 
hive-4.0.1-must pull-request-available)

> Materialized view rebuild lock heart-beating is broken
> --
>
> Key: HIVE-28190
> URL: https://issues.apache.org/jira/browse/HIVE-28190
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Affects Versions: 4.0.0
>Reporter: Zsolt Miskolczi
>Assignee: Zsolt Miskolczi
>Priority: Critical
>  Labels: hive-4.0.1-merged, hive-4.0.1-must, 
> pull-request-available
> Fix For: 4.1.0
>
>
> It fails with the following error: 
> {code:java}
> org.springframework.dao.InvalidDataAccessApiUsageException: SQL [UPDATE 
> "MATERIALIZATION_REBUILD_LOCKS" SET "MRL_LAST_HEARTBEAT" = 1712571919559 
> WHERE "MRL_TXN_ID" = 2297 AND "MRL_DB_NAME" = ? AND "MRL_TBL_NAME" = ?]: 
> given 2 parameters but expected 0 {code}
> We have not spotted it so far because a failed materialized view heartbeat 
> does not affect the running rebuild query, so it can only be noticed by 
> actively watching the logs.




[jira] [Updated] (HIVE-28166) Iceberg: Truncate on branch operates on the main table

2024-06-15 Thread Zhihua Deng (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-28166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng updated HIVE-28166:
---
Labels: hive-4.0.1-merged hive-4.0.1-must pull-request-available  (was: 
hive-4.0.1-must pull-request-available)

> Iceberg: Truncate on branch operates on the main table
> --
>
> Key: HIVE-28166
> URL: https://issues.apache.org/jira/browse/HIVE-28166
> Project: Hive
>  Issue Type: Bug
>  Components: Iceberg integration
>Affects Versions: 4.0.0
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Critical
>  Labels: hive-4.0.1-merged, hive-4.0.1-must, 
> pull-request-available
> Fix For: 4.1.0
>
>
> Running a truncate operation on an Iceberg table branch operates on the 
> main table instead of on the branch.




[jira] [Updated] (HIVE-28158) Add ASF license header in non-java files

2024-06-15 Thread Zhihua Deng (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-28158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng updated HIVE-28158:
---
Labels: hive-4.0.1-merged hive-4.0.1-must  (was: hive-4.0.1-must)

> Add ASF license header in non-java files
> 
>
> Key: HIVE-28158
> URL: https://issues.apache.org/jira/browse/HIVE-28158
> Project: Hive
>  Issue Type: Task
>  Components: Documentation
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: hive-4.0.1-merged, hive-4.0.1-must
> Fix For: 4.1.0
>
>
> According to the [ASF policy|https://www.apache.org/legal/src-headers.html] 
> all source files should contain an ASF header. Currently there are a lot of 
> source files that do not contain the ASF header. The files can be broken into 
> the following categories:
> *Must have:*
>  * Python files (.py)
>  * Bash/Shell script files (.sh)
>  * Javascript files (.js)
> *Should have:*
>  * Maven files (pom.xml)
>  * GitHub workflows and Docker files (.yml)
> *Good to have:*
>  * Hive/Tez/Yarn and other configuration files (.xml)
>  * Log4J property files (.properties)
>  * Markdown files (.md)
> *Could have but OK if they don't:*
>  * Data files for tests (data/files/**)
>  * Generated code files (src/gen)
>  * QTest input/output files (.q, .q.out)
>  * IntelliJ files (.idea)
>  * Other txt and data files
> The changes here aim to address the first three categories (must, should, 
> good) and add the missing header when possible.
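
The categories above could be applied mechanically. A minimal Python sketch, assuming paths are relative to the repository root and that a header is recognised by the first line of the standard ASF text. `header_category` and `has_asf_header` are hypothetical helpers, not part of the Hive build.

```python
from pathlib import PurePosixPath

# First line of the standard ASF source header
ASF_MARKER = "Licensed to the Apache Software Foundation (ASF)"


def has_asf_header(text, max_lines=30):
    """Return True if the ASF marker appears near the top of the file."""
    head = "\n".join(text.splitlines()[:max_lines])
    return ASF_MARKER in head


def header_category(path):
    """Map a repository path to must / should / good / could."""
    p = PurePosixPath(path)
    # Test data and generated code are in the "could" bucket regardless of extension.
    if path.startswith("data/files/") or "src/gen" in path:
        return "could"
    if p.suffix in {".py", ".sh", ".js"}:
        return "must"
    if p.name == "pom.xml" or p.suffix == ".yml":
        return "should"
    if p.suffix in {".xml", ".properties", ".md"}:
        return "good"
    return "could"


print(header_category("bin/hive.sh"), header_category("pom.xml"))
```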




[jira] [Updated] (HIVE-28098) Fails to copy empty column statistics of materialized CTE

2024-06-15 Thread Zhihua Deng (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-28098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng updated HIVE-28098:
---
Labels: hive-4.0.1-merged hive-4.0.1-must pull-request-available  (was: 
hive-4.0.1-must pull-request-available)

> Fails to copy empty column statistics of materialized CTE
> -
>
> Key: HIVE-28098
> URL: https://issues.apache.org/jira/browse/HIVE-28098
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Shohei Okumiya
>Assignee: Shohei Okumiya
>Priority: Major
>  Labels: hive-4.0.1-merged, hive-4.0.1-must, 
> pull-request-available
> Fix For: 4.1.0
>
>
> HIVE-28080 introduced the optimization of materialized CTEs, but it turned 
> out that it failed when statistics were empty.
> This query reproduces the issue.
> {code:java}
> set hive.stats.autogather=false;
> CREATE TABLE src_no_stats AS SELECT '123' as key, 'val123' as value UNION ALL 
> SELECT '9' as key, 'val9' as value;
> set hive.optimize.cte.materialize.threshold=2;
> set hive.optimize.cte.materialize.full.aggregate.only=false;
> EXPLAIN WITH materialized_cte1 AS (
>   SELECT * FROM src_no_stats
> ),
> materialized_cte2 AS (
>   SELECT a.key
>   FROM materialized_cte1 a
>   JOIN materialized_cte1 b ON (a.key = b.key)
> )
> SELECT a.key
> FROM materialized_cte2 a
> JOIN materialized_cte2 b ON (a.key = b.key); {code}
> It throws an error.
> {code:java}
> Error: Error while compiling statement: FAILED: IllegalStateException The 
> size of col stats must be equal to that of schema. Stats = [], Schema = [key] 
> (state=42000,code=4) {code}
> With a debugger attached, the FSO of materialized_cte2 turns out to have 
> empty stats because JoinOperator loses them.
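
The invariant behind the error message can be sketched in a few lines of Python. This is an illustration only: `check_col_stats` mirrors the message text, and `copy_col_stats` is a hypothetical tolerant variant that treats empty statistics as "unknown". It is not the actual fix merged for this issue.

```python
def check_col_stats(schema, col_stats):
    """Strict check: one statistics entry per schema column."""
    if len(col_stats) != len(schema):
        raise ValueError(
            "The size of col stats must be equal to that of schema. "
            f"Stats = {col_stats}, Schema = {schema}"
        )


def copy_col_stats(schema, col_stats):
    """Hypothetical tolerant copy: skip when no statistics were gathered."""
    if not col_stats:
        return None
    check_col_stats(schema, col_stats)
    return list(col_stats)


# Stats = [], Schema = [key], as in the reported error
print(copy_col_stats(["key"], []))
```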




[jira] [Updated] (HIVE-28134) Improve SecureCmdDoAs

2024-06-15 Thread Zhihua Deng (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-28134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng updated HIVE-28134:
---
Labels: hive-4.0.1-merged hive-4.0.1-must pull-request-available  (was: 
hive-4.0.1-must pull-request-available)

> Improve SecureCmdDoAs
> -
>
> Key: HIVE-28134
> URL: https://issues.apache.org/jira/browse/HIVE-28134
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
>  Labels: hive-4.0.1-merged, hive-4.0.1-must, 
> pull-request-available
> Fix For: 4.1.0
>
>
> Improve the SecureCmdDoAs code




[jira] [Updated] (HIVE-28161) Incorrect Copyright years in META-INF/NOTICE files

2024-06-15 Thread Zhihua Deng (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-28161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng updated HIVE-28161:
---
Labels: hive-4.0.1-merged hive-4.0.1-must pull-request-available  (was: 
hive-4.0.1-must pull-request-available)

> Incorrect Copyright years in META-INF/NOTICE files
> --
>
> Key: HIVE-28161
> URL: https://issues.apache.org/jira/browse/HIVE-28161
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0-beta-1
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: hive-4.0.1-merged, hive-4.0.1-must, 
> pull-request-available
> Fix For: 4.1.0
>
>
> The generated META-INF/NOTICE file which resides inside each jar produced by 
> Hive has incorrect copyright years.
> Inside all jars the NOTICE file has the following incorrect content:
> {noformat}
> Copyright 2020 The Apache Software Foundation
> {noformat}
> The Copyright statement should include the timespan from the inception of the 
> project to now.
> {noformat}
> Copyright 2008-2024 The Apache Software Foundation
> {noformat}
> The problem can be easily seen by inspecting the jar content after building 
> the module or checking the previously published jars in Maven central.
> {noformat}
> mvn clean install -DskipTests -pl common/
> jar xf common/target/hive-common-4.1.0-SNAPSHOT.jar META-INF
> cat META-INF/NOTICE 
> Hive Common
> Copyright 2020 The Apache Software Foundation
> This product includes software developed at
> The Apache Software Foundation (http://www.apache.org/).
> {noformat}
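
The same inspection can be scripted. A minimal Python sketch that reads META-INF/NOTICE from a jar (a jar is a zip archive) and extracts the copyright years. `make_jar` builds an in-memory jar for demonstration only, and both function names are hypothetical.

```python
import io
import re
import zipfile

COPYRIGHT_RE = re.compile(
    r"^Copyright (\d{4}(?:-\d{4})?) The Apache Software Foundation",
    re.MULTILINE,
)


def notice_copyright_years(jar_bytes):
    """Return the year or year range from META-INF/NOTICE, or None."""
    with zipfile.ZipFile(io.BytesIO(jar_bytes)) as jar:
        notice = jar.read("META-INF/NOTICE").decode("utf-8")
    match = COPYRIGHT_RE.search(notice)
    return match.group(1) if match else None


def make_jar(notice_text):
    """Build a tiny in-memory jar containing only a NOTICE file."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as jar:
        jar.writestr("META-INF/NOTICE", notice_text)
    return buf.getvalue()


bad = make_jar("Hive Common\nCopyright 2020 The Apache Software Foundation\n")
print(notice_copyright_years(bad))  # 2020
```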




[jira] [Updated] (HIVE-27847) Prevent query Failures on Numeric <-> Timestamp

2024-06-15 Thread Zhihua Deng (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-27847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng updated HIVE-27847:
---
Labels: hive-4.0.1-merged hive-4.0.1-must pull-request-available  (was: 
hive-4.0.1-must pull-request-available)

>  Prevent query Failures on Numeric <-> Timestamp
> 
>
> Key: HIVE-27847
> URL: https://issues.apache.org/jira/browse/HIVE-27847
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 4.0.0-alpha-1, 4.0.0-alpha-2
> Environment: master
> 4.0.0-alpha-1
>Reporter: Basapuram Kumar
>Assignee: Shohei Okumiya
>Priority: Major
>  Labels: hive-4.0.1-merged, hive-4.0.1-must, 
> pull-request-available
> Fix For: 4.1.0
>
> Attachments: HIVE-27847.patch
>
>
> On the master and 4.0.0-alpha-1 branches, a Numeric to Timestamp conversion 
> fails with the error 
> "{color:#de350b}org.apache.hadoop.hive.ql.exec.UDFArgumentException: Casting 
> NUMERIC types to TIMESTAMP is prohibited 
> (hive.strict.timestamp.conversion){color}".
>  
> *Repro steps.*
>  # Sample data
> {noformat}
> $ hdfs dfs -cat /tmp/tc/t.csv
> 1653209895687,2022-05-22T15:58:15.931+07:00
> 1653209938316,2022-05-22T15:58:58.490+07:00
> 1653209962021,2022-05-22T15:59:22.191+07:00
> 1653210021993,2022-05-22T16:00:22.174+07:00
> 1653209890524,2022-05-22T15:58:10.724+07:00
> 1653210095382,2022-05-22T16:01:35.775+07:00
> 1653210044308,2022-05-22T16:00:44.683+07:00
> 1653210098546,2022-05-22T16:01:38.886+07:00
> 1653210012220,2022-05-22T16:00:12.394+07:00
> 165321376,2022-05-22T16:00:00.622+07:00{noformat}
>  # table with above data [1]
> {noformat}
> create external table   test_ts_conv(begin string, ts string) row format 
> delimited fields terminated by ',' stored as TEXTFILE LOCATION '/tmp/tc/';
> desc   test_ts_conv;
> | col_name  | data_type  | comment  |
> +---++--+
> | begin     | string     |          |
> | ts        | string     |          |
> +---++--+{noformat}
>  #  Create table with CTAS
> {noformat}
> 0: jdbc:hive2://char1000.sre.iti.acceldata.de> set 
> hive.strict.timestamp.conversion;
> +-+
> |                   set                   |
> +-+
> | hive.strict.timestamp.conversion=true  |
> +-+
> set to false
> 0: jdbc:hive2://char1000.sre.iti.acceldata.de> set 
> hive.strict.timestamp.conversion=false;
> +-+
> |                   set                   |
> +-+
> | hive.strict.timestamp.conversion=false  |
> +-+
> #Query:
> 0: jdbc:hive2://char1000.sre.iti.acceldata.de> 
> CREATE TABLE t_date 
> AS 
> select
>   CAST( CAST( `begin` AS BIGINT) / 1000  AS TIMESTAMP ) `begin`, 
>   CAST( 
> DATE_FORMAT(CAST(regexp_replace(`ts`,'(\\d{4})-(\\d{2})-(\\d{2})T(\\d{2}):(\\d{2}):(\\d{2}).(\\d{3})\\+(\\d{2}):(\\d{2})','$1-$2-$3
>  $4:$5:$6.$7') AS TIMESTAMP ),'MMdd') as BIGINT ) `par_key`
> FROM    test_ts_conv;{noformat}
> Error:
> {code:java}
> Caused by: org.apache.hadoop.hive.ql.exec.UDFArgumentException: Casting 
> NUMERIC types to TIMESTAMP is prohibited (hive.strict.timestamp.conversion)
>     at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDFTimestamp.initialize(GenericUDFTimestamp.java:91)
>     at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDF.initializeAndFoldConstants(GenericUDF.java:149)
>     at 
> org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator.initialize(ExprNodeGenericFuncEvaluator.java:184)
>     at 
> org.apache.hadoop.hive.ql.exec.Operator.initEvaluators(Operator.java:1073)
>     at 
> org.apache.hadoop.hive.ql.exec.Operator.initEvaluatorsAndReturnStruct(Operator.java:1099)
>     at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.initializeOp(SelectOperator.java:74)
>     at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:360)
>     at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:549)
>     at 
> org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:503)
>     at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:369)
>     at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:549)
>     at 
> org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:503)
>     at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:369)
>     at 
> org.apache.hadoop.hive.ql.exec.MapOperator.initializeMapOperator(MapOperator.java:508)
>     at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:314)
>     ... 17 more {code}
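
For reference, the conversion the query attempts on the first sample row can be worked out outside Hive. A minimal Python sketch, assuming `begin` holds epoch milliseconds (which dividing by 1000 before the cast implies) and interpreting the result in UTC. Both helper names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def millis_to_timestamp(millis_text):
    """Interpret a string of epoch milliseconds as a UTC timestamp."""
    return EPOCH + timedelta(milliseconds=int(millis_text))


def parse_ts(ts_text):
    """Parse the ISO-8601 `ts` column, keeping its +07:00 offset."""
    return datetime.fromisoformat(ts_text)


# First row of the sample data
begin = millis_to_timestamp("1653209895687")
ts = parse_ts("2022-05-22T15:58:15.931+07:00")

print(begin.isoformat())
print(ts - begin)
```

Integer millisecond arithmetic is used here instead of a floating-point division so that the fractional part is exact.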




[jira] [Updated] (HIVE-28082) HiveAggregateReduceFunctionsRule could generate an inconsistent result

2024-06-15 Thread Zhihua Deng (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-28082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng updated HIVE-28082:
---
Labels: hive-4.0.1-merged hive-4.0.1-must pull-request-available  (was: 
hive-4.0.1-must pull-request-available)

> HiveAggregateReduceFunctionsRule could generate an inconsistent result
> --
>
> Key: HIVE-28082
> URL: https://issues.apache.org/jira/browse/HIVE-28082
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 4.0.0-beta-1
>Reporter: Shohei Okumiya
>Assignee: Shohei Okumiya
>Priority: Major
>  Labels: hive-4.0.1-merged, hive-4.0.1-must, 
> pull-request-available
> Fix For: 4.1.0
>
>
> HiveAggregateReduceFunctionsRule translates AVG, STDDEV_POP, STDDEV_SAMP, 
> VAR_POP, and VAR_SAMP. Those UDFs accept string types and try to decode them 
> as floating point values. It is possible that undecodable values exist.
> We found that it could cause inconsistent behaviors with or without CBO.
> {code:java}
> 0: jdbc:hive2://hive-hiveserver2:1/defaul> SELECT AVG('text');
> ...
> +--+
> | _c0  |
> +--+
> | 0.0  |
> +--+
> 1 row selected (18.229 seconds)
> 0: jdbc:hive2://hive-hiveserver2:1/defaul> set hive.cbo.enable=false;
> No rows affected (0.013 seconds)
> 0: jdbc:hive2://hive-hiveserver2:1/defaul> SELECT AVG('text');
> ...
> +---+
> |  _c0  |
> +---+
> | NULL  |
> +---+ {code}




[jira] [Updated] (HIVE-28204) Remove some HMS obsolete scripts

2024-06-15 Thread Zhihua Deng (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-28204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng updated HIVE-28204:
---
Labels: hive-4.0.1-merged hive-4.0.1-must pull-request-available  (was: 
hive-4.0.1-must pull-request-available)

> Remove some HMS obsolete scripts
> 
>
> Key: HIVE-28204
> URL: https://issues.apache.org/jira/browse/HIVE-28204
> Project: Hive
>  Issue Type: Task
>Reporter: Zhihua Deng
>Assignee: Zhihua Deng
>Priority: Major
>  Labels: hive-4.0.1-merged, hive-4.0.1-must, 
> pull-request-available
> Fix For: 4.1.0
>
>
> As Hive 1.x has reached end of life, its HMS metadata scripts should be 
> removed from the repository and the packaged tarball. However, it is better 
> to keep the scripts needed for upgrading from 1.x and for the tests.




[jira] [Resolved] (HIVE-28204) Remove some HMS obsolete scripts

2024-06-10 Thread Zhihua Deng (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-28204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng resolved HIVE-28204.

Fix Version/s: 4.1.0
   Resolution: Fixed

> Remove some HMS obsolete scripts
> 
>
> Key: HIVE-28204
> URL: https://issues.apache.org/jira/browse/HIVE-28204
> Project: Hive
>  Issue Type: Task
>Reporter: Zhihua Deng
>Assignee: Zhihua Deng
>Priority: Major
>  Labels: hive-4.0.1-must, pull-request-available
> Fix For: 4.1.0
>
>
> As Hive 1.x has reached end of life, its HMS metadata scripts should be 
> removed from the repository and the packaged tarball. However, it is better 
> to keep the scripts needed for upgrading from 1.x and for the tests.




[jira] [Commented] (HIVE-28308) The module `org.apache.hive:hive-jdbc:4.0.0` unintuitively removed all dependencies under the Compile Scope

2024-06-10 Thread Zhihua Deng (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-28308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17853584#comment-17853584
 ] 

Zhihua Deng commented on HIVE-28308:


I think org.apache.hive:hive-jdbc:4.0.0 behaves as expected. In your case, 
please try
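{noformat}
<dependency>
  <groupId>org.apache.hive</groupId>
  <artifactId>hive-jdbc</artifactId>
  <classifier>standalone</classifier>
  <version>4.0.0</version>
</dependency>
{noformat}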
With it there should be no need to add extra dependencies, but I'm not sure 
whether it works well with the embedded HS2.
However, some classes were missing when I tried this standalone jar, so I 
filed HIVE-28315 to address the problem.

> The module `org.apache.hive:hive-jdbc:4.0.0` unintuitively removed all 
> dependencies under the Compile Scope
> ---
>
> Key: HIVE-28308
> URL: https://issues.apache.org/jira/browse/HIVE-28308
> Project: Hive
>  Issue Type: Bug
>Reporter: Qiheng He
>Priority: Major
>
> - The module *org.apache.hive:hive-jdbc:4.0.0* unintuitively removed all 
> dependencies under the Compile Scope.
>  - Comparing the dependencies listed at 
> [https://central.sonatype.com/artifact/org.apache.hive/hive-jdbc/4.0.0/dependencies]
>  with those at 
> [https://central.sonatype.com/artifact/org.apache.hive/hive-jdbc/3.1.3/dependencies]
>  , it becomes apparent that *org.apache.hive:hive-jdbc:4.0.0* includes only 
> test dependencies. This results in the need to manually add additional 
> dependencies when utilizing the HiveServer2 JDBC Driver.
>  - 
> {code:xml}
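> <dependency>
>   <groupId>org.apache.hive</groupId>
>   <artifactId>hive-jdbc</artifactId>
>   <version>4.0.0</version>
> </dependency>
> <dependency>
>   <groupId>org.apache.hive</groupId>
>   <artifactId>hive-service</artifactId>
>   <version>4.0.0</version>
>   <exclusions>
>     <exclusion>
>       <groupId>org.apache.logging.log4j</groupId>
>       <artifactId>log4j-slf4j-impl</artifactId>
>     </exclusion>
>     <exclusion>
>       <groupId>org.slf4j</groupId>
>       <artifactId>slf4j-reload4j</artifactId>
>     </exclusion>
>     <exclusion>
>       <groupId>org.apache.logging.log4j</groupId>
>       <artifactId>log4j-web</artifactId>
>     </exclusion>
>     <exclusion>
>       <groupId>org.apache.logging.log4j</groupId>
>       <artifactId>log4j-core</artifactId>
>     </exclusion>
>     <exclusion>
>       <groupId>org.apache.logging.log4j</groupId>
>       <artifactId>log4j-api</artifactId>
>     </exclusion>
>     <exclusion>
>       <groupId>org.apache.logging.log4j</groupId>
>       <artifactId>log4j-1.2-api</artifactId>
>     </exclusion>
>   </exclusions>
> </dependency>
> <dependency>
>   <groupId>org.apache.hadoop</groupId>
>   <artifactId>hadoop-client-runtime</artifactId>
>   <version>3.3.6</version>
> </dependency>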
> {code}
>  - Earlier investigations can be found at 
> https://github.com/apache/shardingSphere/pull/31526 and 
> https://github.com/dbeaver/dbeaver/issues/22777 . I personally think this 
> behavior is not reasonable.




[jira] [Updated] (HIVE-28315) Missing classes while using hive jdbc standalone jar

2024-06-10 Thread Zhihua Deng (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-28315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng updated HIVE-28315:
---
Labels: hive-4.0.1-must  (was: )

> Missing classes while using hive jdbc standalone jar
> 
>
> Key: HIVE-28315
> URL: https://issues.apache.org/jira/browse/HIVE-28315
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zhihua Deng
>Priority: Major
>  Labels: hive-4.0.1-must
>





[jira] [Created] (HIVE-28315) Missing classes while using hive jdbc standalone jar

2024-06-10 Thread Zhihua Deng (Jira)

Zhihua Deng created HIVE-28315:
--

 Summary: Missing classes while using hive jdbc standalone jar
 Key: HIVE-28315
 URL: https://issues.apache.org/jira/browse/HIVE-28315
 Project: Hive
  Issue Type: Improvement
Reporter: Zhihua Deng
Assignee: Zhihua Deng







[jira] [Assigned] (HIVE-28315) Missing classes while using hive jdbc standalone jar

2024-06-10 Thread Zhihua Deng (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-28315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng reassigned HIVE-28315:
--

Assignee: (was: Zhihua Deng)

> Missing classes while using hive jdbc standalone jar
> 
>
> Key: HIVE-28315
> URL: https://issues.apache.org/jira/browse/HIVE-28315
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zhihua Deng
>Priority: Major
>





[jira] [Updated] (HIVE-28308) The module `org.apache.hive:hive-jdbc:4.0.0` unintuitively removed all dependencies under the Compile Scope

2024-06-06 Thread Zhihua Deng (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-28308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng updated HIVE-28308:
---
Labels:   (was: hive-4.0.1-must)

> The module `org.apache.hive:hive-jdbc:4.0.0` unintuitively removed all 
> dependencies under the Compile Scope
> ---
>
> Key: HIVE-28308
> URL: https://issues.apache.org/jira/browse/HIVE-28308
> Project: Hive
>  Issue Type: Bug
>Reporter: Qiheng He
>Priority: Major
>
> - The module *org.apache.hive:hive-jdbc:4.0.0* unintuitively removed all 
> dependencies under the Compile Scope.
>  - Comparing the dependencies listed at 
> [https://central.sonatype.com/artifact/org.apache.hive/hive-jdbc/4.0.0/dependencies]
>  with those at 
> [https://central.sonatype.com/artifact/org.apache.hive/hive-jdbc/3.1.3/dependencies]
>  , it becomes apparent that *org.apache.hive:hive-jdbc:4.0.0* includes only 
> test dependencies. This results in the need to manually add additional 
> dependencies when utilizing the HiveServer2 JDBC Driver.
>  - 
> {code:xml}
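> <dependency>
>   <groupId>org.apache.hive</groupId>
>   <artifactId>hive-jdbc</artifactId>
>   <version>4.0.0</version>
> </dependency>
> <dependency>
>   <groupId>org.apache.hive</groupId>
>   <artifactId>hive-service</artifactId>
>   <version>4.0.0</version>
>   <exclusions>
>     <exclusion>
>       <groupId>org.apache.logging.log4j</groupId>
>       <artifactId>log4j-slf4j-impl</artifactId>
>     </exclusion>
>     <exclusion>
>       <groupId>org.slf4j</groupId>
>       <artifactId>slf4j-reload4j</artifactId>
>     </exclusion>
>     <exclusion>
>       <groupId>org.apache.logging.log4j</groupId>
>       <artifactId>log4j-web</artifactId>
>     </exclusion>
>     <exclusion>
>       <groupId>org.apache.logging.log4j</groupId>
>       <artifactId>log4j-core</artifactId>
>     </exclusion>
>     <exclusion>
>       <groupId>org.apache.logging.log4j</groupId>
>       <artifactId>log4j-api</artifactId>
>     </exclusion>
>     <exclusion>
>       <groupId>org.apache.logging.log4j</groupId>
>       <artifactId>log4j-1.2-api</artifactId>
>     </exclusion>
>   </exclusions>
> </dependency>
> <dependency>
>   <groupId>org.apache.hadoop</groupId>
>   <artifactId>hadoop-client-runtime</artifactId>
>   <version>3.3.6</version>
> </dependency>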
> {code}
>  - Earlier investigations can be found at 
> https://github.com/apache/shardingSphere/pull/31526 and 
> https://github.com/dbeaver/dbeaver/issues/22777 . I personally think this 
> behavior is not reasonable.




[jira] [Updated] (HIVE-28308) The module `org.apache.hive:hive-jdbc:4.0.0` unintuitively removed all dependencies under the Compile Scope

2024-06-06 Thread Zhihua Deng (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-28308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng updated HIVE-28308:
---
Labels: hive-4.0.1-must  (was: )

> The module `org.apache.hive:hive-jdbc:4.0.0` unintuitively removed all 
> dependencies under the Compile Scope
> ---
>
> Key: HIVE-28308
> URL: https://issues.apache.org/jira/browse/HIVE-28308
> Project: Hive
>  Issue Type: Bug
>Reporter: Qiheng He
>Priority: Major
>  Labels: hive-4.0.1-must
>
> - The module *org.apache.hive:hive-jdbc:4.0.0* unintuitively removed all 
> dependencies under the Compile Scope.
>  - Comparing the dependencies listed at 
> [https://central.sonatype.com/artifact/org.apache.hive/hive-jdbc/4.0.0/dependencies]
>  with those at 
> [https://central.sonatype.com/artifact/org.apache.hive/hive-jdbc/3.1.3/dependencies]
>  , it becomes apparent that *org.apache.hive:hive-jdbc:4.0.0* includes only 
> test dependencies. This results in the need to manually add additional 
> dependencies when utilizing the HiveServer2 JDBC Driver.
>  - 
> {code:xml}
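> <dependency>
>   <groupId>org.apache.hive</groupId>
>   <artifactId>hive-jdbc</artifactId>
>   <version>4.0.0</version>
> </dependency>
> <dependency>
>   <groupId>org.apache.hive</groupId>
>   <artifactId>hive-service</artifactId>
>   <version>4.0.0</version>
>   <exclusions>
>     <exclusion>
>       <groupId>org.apache.logging.log4j</groupId>
>       <artifactId>log4j-slf4j-impl</artifactId>
>     </exclusion>
>     <exclusion>
>       <groupId>org.slf4j</groupId>
>       <artifactId>slf4j-reload4j</artifactId>
>     </exclusion>
>     <exclusion>
>       <groupId>org.apache.logging.log4j</groupId>
>       <artifactId>log4j-web</artifactId>
>     </exclusion>
>     <exclusion>
>       <groupId>org.apache.logging.log4j</groupId>
>       <artifactId>log4j-core</artifactId>
>     </exclusion>
>     <exclusion>
>       <groupId>org.apache.logging.log4j</groupId>
>       <artifactId>log4j-api</artifactId>
>     </exclusion>
>     <exclusion>
>       <groupId>org.apache.logging.log4j</groupId>
>       <artifactId>log4j-1.2-api</artifactId>
>     </exclusion>
>   </exclusions>
> </dependency>
> <dependency>
>   <groupId>org.apache.hadoop</groupId>
>   <artifactId>hadoop-client-runtime</artifactId>
>   <version>3.3.6</version>
> </dependency>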
> {code}
>  - Earlier investigations can be found at 
> https://github.com/apache/shardingSphere/pull/31526 and 
> https://github.com/dbeaver/dbeaver/issues/22777 . I personally think this 
> behavior is not reasonable.




[jira] [Assigned] (HIVE-26538) MetastoreDefaultTransformer should revise the location when it's empty

2024-06-06 Thread Zhihua Deng (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-26538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng reassigned HIVE-26538:
--

Assignee: (was: Zhihua Deng)

> MetastoreDefaultTransformer should revise the location when it's empty
> --
>
> Key: HIVE-26538
> URL: https://issues.apache.org/jira/browse/HIVE-26538
> Project: Hive
>  Issue Type: Improvement
>  Components: Standalone Metastore
>Reporter: Zhihua Deng
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> The table's location is treated as null when it is empty. This happens in 
> places such as:
> [https://github.com/apache/hive/blob/82f319773cb2361a98963e861fb903ab8eecd9c4/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HMSHandler.java#L2367]
> [https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetastoreDefaultTransformer.java#L729]
>   
> MetastoreDefaultTransformer should revise the empty location when 
> altering/creating tables.




[jira] [Resolved] (HIVE-28286) Add filtering support for get_table_metas API in Hive metastore

2024-06-06 Thread Zhihua Deng (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-28286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng resolved HIVE-28286.

Fix Version/s: 4.1.0
   Resolution: Fixed

Fix has been merged! Thank you Naveen for the PR and everyone involved in the 
review!

> Add filtering support for get_table_metas API in Hive metastore
> ---
>
> Key: HIVE-28286
> URL: https://issues.apache.org/jira/browse/HIVE-28286
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Affects Versions: 4.0.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.1.0
>
>
> Hive Metastore supports filtering objects through the plugin authorizer 
> for some APIs such as getTables(), getDatabases(), getDataConnectors() etc. 
> However, the same should be done for the get_table_metas() API call.




[jira] [Updated] (HIVE-27847) Prevent query Failures on Numeric <-> Timestamp

2024-06-05 Thread Zhihua Deng (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-27847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng updated HIVE-27847:
---
Fix Version/s: 4.1.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

Fix has been merged. Thank you [~basapuram.kumar] for the issue report and the 
original work, and [~okumin] for the check and contribution!

>  Prevent query Failures on Numeric <-> Timestamp
> 
>
> Key: HIVE-27847
> URL: https://issues.apache.org/jira/browse/HIVE-27847
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 4.0.0-alpha-1, 4.0.0-alpha-2
> Environment: master
> 4.0.0-alpha-1
>Reporter: Basapuram Kumar
>Assignee: Shohei Okumiya
>Priority: Major
>  Labels: hive-4.0.1-must, pull-request-available
> Fix For: 4.1.0
>
> Attachments: HIVE-27847.patch
>
>
> On the master and 4.0.0-alpha-1 branches, a Numeric to Timestamp 
> conversion fails with the error 
> "{color:#de350b}org.apache.hadoop.hive.ql.exec.UDFArgumentException: Casting 
> NUMERIC types to TIMESTAMP is prohibited 
> (hive.strict.timestamp.conversion){color}".
>  
> *Repro steps.*
>  # Sample data
> {noformat}
> $ hdfs dfs -cat /tmp/tc/t.csv
> 1653209895687,2022-05-22T15:58:15.931+07:00
> 1653209938316,2022-05-22T15:58:58.490+07:00
> 1653209962021,2022-05-22T15:59:22.191+07:00
> 1653210021993,2022-05-22T16:00:22.174+07:00
> 1653209890524,2022-05-22T15:58:10.724+07:00
> 1653210095382,2022-05-22T16:01:35.775+07:00
> 1653210044308,2022-05-22T16:00:44.683+07:00
> 1653210098546,2022-05-22T16:01:38.886+07:00
> 1653210012220,2022-05-22T16:00:12.394+07:00
> 165321376,2022-05-22T16:00:00.622+07:00{noformat}
>  # table with above data [1]
> {noformat}
> create external table   test_ts_conv(begin string, ts string) row format 
> delimited fields terminated by ',' stored as TEXTFILE LOCATION '/tmp/tc/';
> desc   test_ts_conv;
> | col_name  | data_type  | comment  |
> +-----------+------------+----------+
> | begin     | string     |          |
> | ts        | string     |          |
> +-----------+------------+----------+{noformat}
>  #  Create table with CTAS
> {noformat}
> 0: jdbc:hive2://char1000.sre.iti.acceldata.de> set 
> hive.strict.timestamp.conversion;
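> +-----------------------------------------+
> |                   set                   |
> +-----------------------------------------+
> | hive.strict.timestamp.conversion=true   |
> +-----------------------------------------+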
> set to false
> 0: jdbc:hive2://char1000.sre.iti.acceldata.de> set 
> hive.strict.timestamp.conversion=false;
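> +-----------------------------------------+
> |                   set                   |
> +-----------------------------------------+
> | hive.strict.timestamp.conversion=false  |
> +-----------------------------------------+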
> #Query:
> 0: jdbc:hive2://char1000.sre.iti.acceldata.de> 
> CREATE TABLE t_date 
> AS 
> select
>   CAST( CAST( `begin` AS BIGINT) / 1000  AS TIMESTAMP ) `begin`, 
>   CAST( 
> DATE_FORMAT(CAST(regexp_replace(`ts`,'(\\d{4})-(\\d{2})-(\\d{2})T(\\d{2}):(\\d{2}):(\\d{2}).(\\d{3})\\+(\\d{2}):(\\d{2})','$1-$2-$3
>  $4:$5:$6.$7') AS TIMESTAMP ),'MMdd') as BIGINT ) `par_key`
> FROM    test_ts_conv;{noformat}
> Error:
> {code:java}
> Caused by: org.apache.hadoop.hive.ql.exec.UDFArgumentException: Casting 
> NUMERIC types to TIMESTAMP is prohibited (hive.strict.timestamp.conversion)
>     at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDFTimestamp.initialize(GenericUDFTimestamp.java:91)
>     at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDF.initializeAndFoldConstants(GenericUDF.java:149)
>     at 
> org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator.initialize(ExprNodeGenericFuncEvaluator.java:184)
>     at 
> org.apache.hadoop.hive.ql.exec.Operator.initEvaluators(Operator.java:1073)
>     at 
> org.apache.hadoop.hive.ql.exec.Operator.initEvaluatorsAndReturnStruct(Operator.java:1099)
>     at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.initializeOp(SelectOperator.java:74)
>     at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:360)
>     at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:549)
>     at 
> org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:503)
>     at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:369)
>     at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:549)
>     at 
> org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:503)
>     at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:369)
>     at 
> org.apache.hadoop.hive.ql.exec.MapOperator.initializeMapOperator(MapOperator.java:508)
>     at 
>

[jira] [Comment Edited] (HIVE-28293) TestGracefulStopHS2 runs too long

2024-06-03 Thread Zhihua Deng (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-28293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17851550#comment-17851550
 ] 

Zhihua Deng edited comment on HIVE-28293 at 6/3/24 8:18 AM:


Sounds like the query keeps running after cancellation until it completes (Tez) 
in this case. Could we remove this line?

[https://github.com/apache/hive/blob/8c90ec0ce576d6319470f7dc4dd27daebb654dec/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezTask.java#L420]


was (Author: dengzh):
Sounds like the query keeps running after cancellation until it completes (Tez) 
in this case. Could we remove this line?

[https://github.com/apache/hive/blob/8c90ec0ce576d6319470f7dc4dd27daebb654dec/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezTask.java#L420]

 

 

> TestGracefulStopHS2 runs too long
> -
>
> Key: HIVE-28293
> URL: https://issues.apache.org/jira/browse/HIVE-28293
> Project: Hive
>  Issue Type: Bug
>Reporter: László Bodor
>Assignee: Zhihua Deng
>Priority: Major
>  Labels: newbie
>
> It tests whether long queries finish successfully in the event of a 
> graceful shutdown; however, 10 minutes seems to be too long: 
> https://github.com/apache/hive/blob/8c90ec0ce576d6319470f7dc4dd27daebb654dec/itests/hive-unit/src/test/java/org/apache/hive/service/server/TestGracefulStopHS2.java#L85
> I'm pretty sure we can validate this in a few minutes
> {code}
> [INFO] ---
> [INFO]  T E S T S
> [INFO] ---
> [INFO] Running org.apache.hive.service.server.TestGracefulStopHS2
> [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
> 1,225.833 s - in org.apache.hive.service.server.TestGracefulStopHS2
> {code}




[jira] [Commented] (HIVE-28293) TestGracefulStopHS2 runs too long

2024-06-03 Thread Zhihua Deng (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-28293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17851550#comment-17851550
 ] 

Zhihua Deng commented on HIVE-28293:


Sounds like the query keeps running after cancellation until it completes (Tez) 
in this case. Could we remove this line?

[https://github.com/apache/hive/blob/8c90ec0ce576d6319470f7dc4dd27daebb654dec/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezTask.java#L420]

 

 

> TestGracefulStopHS2 runs too long
> -
>
> Key: HIVE-28293
> URL: https://issues.apache.org/jira/browse/HIVE-28293
> Project: Hive
>  Issue Type: Bug
>Reporter: László Bodor
>Assignee: Zhihua Deng
>Priority: Major
>  Labels: newbie
>
> It tests whether long queries finish successfully in the event of a 
> graceful shutdown; however, 10 minutes seems to be too long: 
> https://github.com/apache/hive/blob/8c90ec0ce576d6319470f7dc4dd27daebb654dec/itests/hive-unit/src/test/java/org/apache/hive/service/server/TestGracefulStopHS2.java#L85
> I'm pretty sure we can validate this in a few minutes
> {code}
> [INFO] ---
> [INFO]  T E S T S
> [INFO] ---
> [INFO] Running org.apache.hive.service.server.TestGracefulStopHS2
> [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
> 1,225.833 s - in org.apache.hive.service.server.TestGracefulStopHS2
> {code}




[jira] [Commented] (HIVE-28293) TestGracefulStopHS2 runs too long

2024-06-02 Thread Zhihua Deng (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-28293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17851517#comment-17851517
 ] 

Zhihua Deng commented on HIVE-28293:


The second query with 60ms is supposed to time out and fail, as HS2 has 
been shut down within 60s,

https://github.com/apache/hive/blob/8c90ec0ce576d6319470f7dc4dd27daebb654dec/itests/hive-unit/src/test/java/org/apache/hive/service/server/TestGracefulStopHS2.java#L55

it shouldn't have been running for 10min when the shutdown is triggered.

Not sure why the test takes so much time; when I run it on my local laptop, it 
finishes within one minute.

 

> TestGracefulStopHS2 runs too long
> -
>
> Key: HIVE-28293
> URL: https://issues.apache.org/jira/browse/HIVE-28293
> Project: Hive
>  Issue Type: Bug
>Reporter: László Bodor
>Assignee: Zhihua Deng
>Priority: Major
>  Labels: newbie
>
> It tests whether long queries finish successfully in the event of a 
> graceful shutdown; however, 10 minutes seems to be too long: 
> https://github.com/apache/hive/blob/8c90ec0ce576d6319470f7dc4dd27daebb654dec/itests/hive-unit/src/test/java/org/apache/hive/service/server/TestGracefulStopHS2.java#L85
> I'm pretty sure we can validate this in a few minutes
> {code}
> [INFO] ---
> [INFO]  T E S T S
> [INFO] ---
> [INFO] Running org.apache.hive.service.server.TestGracefulStopHS2
> [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
> 1,225.833 s - in org.apache.hive.service.server.TestGracefulStopHS2
> {code}




[jira] [Comment Edited] (HIVE-28293) TestGracefulStopHS2 runs too long

2024-06-02 Thread Zhihua Deng (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-28293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17851517#comment-17851517
 ] 

Zhihua Deng edited comment on HIVE-28293 at 6/3/24 4:42 AM:


The second query with 60ms is supposed to time out and fail, as HS2 has 
been shut down within 60s,

[https://github.com/apache/hive/blob/8c90ec0ce576d6319470f7dc4dd27daebb654dec/itests/hive-unit/src/test/java/org/apache/hive/service/server/TestGracefulStopHS2.java#L55]

it shouldn't have been running for 10min when the shutdown is triggered.

Not sure why the test takes so much time; when I run it on my local laptop, it 
finishes within one minute.


was (Author: dengzh):
The second query with 60ms is supposed to time out and fail, as HS2 has 
been shut down within 60s,

https://github.com/apache/hive/blob/8c90ec0ce576d6319470f7dc4dd27daebb654dec/itests/hive-unit/src/test/java/org/apache/hive/service/server/TestGracefulStopHS2.java#L55

it shouldn't have been running for 10min when the shutdown is triggered.

Not sure why the test takes so much time; when I run it on my local laptop, it 
finishes within one minute.

 

> TestGracefulStopHS2 runs too long
> -
>
> Key: HIVE-28293
> URL: https://issues.apache.org/jira/browse/HIVE-28293
> Project: Hive
>  Issue Type: Bug
>Reporter: László Bodor
>Assignee: Zhihua Deng
>Priority: Major
>  Labels: newbie
>
> It tests whether long queries finish successfully in the event of a 
> graceful shutdown; however, 10 minutes seems to be too long: 
> https://github.com/apache/hive/blob/8c90ec0ce576d6319470f7dc4dd27daebb654dec/itests/hive-unit/src/test/java/org/apache/hive/service/server/TestGracefulStopHS2.java#L85
> I'm pretty sure we can validate this in a few minutes
> {code}
> [INFO] ---
> [INFO]  T E S T S
> [INFO] ---
> [INFO] Running org.apache.hive.service.server.TestGracefulStopHS2
> [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
> 1,225.833 s - in org.apache.hive.service.server.TestGracefulStopHS2
> {code}




[jira] [Assigned] (HIVE-28293) TestGracefulStopHS2 runs too long

2024-06-02 Thread Zhihua Deng (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-28293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng reassigned HIVE-28293:
--

Assignee: Zhihua Deng

> TestGracefulStopHS2 runs too long
> -
>
> Key: HIVE-28293
> URL: https://issues.apache.org/jira/browse/HIVE-28293
> Project: Hive
>  Issue Type: Bug
>Reporter: László Bodor
>Assignee: Zhihua Deng
>Priority: Major
>  Labels: newbie
>
> It tests whether long queries finish successfully in the event of a 
> graceful shutdown; however, 10 minutes seems to be too long: 
> https://github.com/apache/hive/blob/8c90ec0ce576d6319470f7dc4dd27daebb654dec/itests/hive-unit/src/test/java/org/apache/hive/service/server/TestGracefulStopHS2.java#L85
> I'm pretty sure we can validate this in a few minutes
> {code}
> [INFO] ---
> [INFO]  T E S T S
> [INFO] ---
> [INFO] Running org.apache.hive.service.server.TestGracefulStopHS2
> [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
> 1,225.833 s - in org.apache.hive.service.server.TestGracefulStopHS2
> {code}




[jira] [Resolved] (HIVE-28263) Metastore scripts : Update query getting stuck when sub-query of in-clause is returning empty results

2024-05-31 Thread Zhihua Deng (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-28263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng resolved HIVE-28263.

Fix Version/s: Not Applicable
   Resolution: Not A Bug

I tested the query against the target db; changing the left join to an inner 
join solves the problem. The upstream script already uses an inner join, so the 
problem does not occur there. Closing the jira.

> Metastore scripts : Update query getting stuck when sub-query of in-clause is 
> returning empty results
> -
>
> Key: HIVE-28263
> URL: https://issues.apache.org/jira/browse/HIVE-28263
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Taraka Rama Rao Lethavadla
>Assignee: Taraka Rama Rao Lethavadla
>Priority: Major
> Fix For: Not Applicable
>
>
> As part of the fix for HIVE-27457, 
> the query below was added to 
> [upgrade-4.0.0-alpha-2-to-4.0.0-beta-1.mysql.sql|https://github.com/apache/hive/blob/0e84fe2000c026afd0a49f4e7c7dd5f54fe7b1ec/standalone-metastore/metastore-server/src/main/sql/mysql/upgrade-4.0.0-alpha-2-to-4.0.0-beta-1.mysql.sql#L43]
> {noformat}
> UPDATE SERDES
> SET SERDES.SLIB = "org.apache.hadoop.hive.kudu.KuduSerDe"
> WHERE SERDE_ID IN (
> SELECT SDS.SERDE_ID
> FROM TBLS
> INNER JOIN SDS ON TBLS.SD_ID = SDS.SD_ID
> WHERE TBLS.TBL_ID IN (SELECT TBL_ID FROM TABLE_PARAMS WHERE PARAM_VALUE LIKE 
> '%KuduStorageHandler%')
> );{noformat}
> This query hangs in MySQL when the sub-query returns empty results.
>  
>  
> {noformat}
> MariaDB [test]> SELECT TBL_ID FROM table_params WHERE PARAM_VALUE LIKE 
> '%KuduStorageHandler%';
> Empty set (0.33 sec)
> MariaDB [test]> SELECT sds.SERDE_ID FROM tbls LEFT JOIN sds ON tbls.SD_ID = 
> sds.SD_ID WHERE tbls.TBL_ID IN (SELECT TBL_ID FROM table_params WHERE 
> PARAM_VALUE LIKE '%KuduStorageHandler%');
> Empty set (0.44 sec)
> {noformat}
> The query kept running for more than 20 minutes:
> {noformat}
> MariaDB [test]> UPDATE serdes SET serdes.SLIB = 
> "org.apache.hadoop.hive.kudu.KuduSerDe" WHERE SERDE_ID IN ( SELECT 
> sds.SERDE_ID FROM tbls LEFT JOIN sds ON tbls.SD_ID = sds.SD_ID WHERE 
> tbls.TBL_ID IN (SELECT TBL_ID FROM table_params WHERE PARAM_VALUE LIKE 
> '%KuduStorageHandler%'));
> ^CCtrl-C -- query killed. Continuing normally.
> ERROR 1317 (70100): Query execution was interrupted{noformat}
> The EXPLAIN EXTENDED output looks like:
> {noformat}
> MariaDB [test]> explain extended UPDATE serdes SET serdes.SLIB = 
> "org.apache.hadoop.hive.kudu.KuduSerDe" WHERE SERDE_ID IN ( SELECT 
> sds.SERDE_ID FROM tbls LEFT JOIN sds ON tbls.SD_ID = sds.SD_ID WHERE 
> tbls.TBL_ID IN (SELECT TBL_ID FROM table_params WHERE PARAM_VALUE LIKE 
> '%KuduStorageHandler%'));
> +------+--------------------+--------------+--------+---------------------------+--------------+---------+-----------------+--------+----------+-------------+
> | id   | select_type        | table        | type   | possible_keys             | key          | key_len | ref             | rows   | filtered | Extra       |
> +------+--------------------+--------------+--------+---------------------------+--------------+---------+-----------------+--------+----------+-------------+
> |    1 | PRIMARY            | serdes       | index  | NULL                      | PRIMARY      | 8       | NULL            | 401267 |   100.00 | Using where |
> |    2 | DEPENDENT SUBQUERY | tbls         | index  | PRIMARY,TBLS_N50,TBLS_N49 | TBLS_N50     | 9       | NULL            |  50921 |   100.00 | Using index |
> |    2 | DEPENDENT SUBQUERY |              | eq_ref | distinct_key              | distinct_key | 8       | func            |      1 |   100.00 |             |
> |    2 | DEPENDENT SUBQUERY | sds          | eq_ref | PRIMARY                   | PRIMARY      | 8       | test.tbls.SD_ID |      1 |   100.00 | Using where |
> |    3 | MATERIALIZED       | table_params | ALL    | PRIMARY,TABLE_PARAMS_N49  | NULL         | NULL    | NULL            | 356593 |   100.00 | Using where |
> +------+--------------------+--------------+--------+---------------------------+--------------+---------+-----------------+--------+----------+-------------+
> 5 rows in set (0.00 sec){noformat}
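
The comment above says that switching the LEFT JOIN to an INNER JOIN resolves the hang. The sketch below does not reproduce the hang (that depends on the MySQL/MariaDB planner); it only checks, on a tiny in-memory SQLite database, that both join kinds update the same SERDES rows. The schema is a simplified assumption, not the real metastore schema, the sample rows and class names such as 'x.KuduStorageHandler' are made up, and the helper name serdes_after_update is hypothetical.

```python
import sqlite3

# Simplified, assumed schema and sample rows (not the real metastore schema).
# TBL_ID 300 has no storage descriptor, so a LEFT JOIN yields a NULL SERDE_ID for it.
SCHEMA_AND_DATA = """
CREATE TABLE SERDES (SERDE_ID INTEGER PRIMARY KEY, SLIB TEXT);
CREATE TABLE SDS (SD_ID INTEGER PRIMARY KEY, SERDE_ID INTEGER);
CREATE TABLE TBLS (TBL_ID INTEGER PRIMARY KEY, SD_ID INTEGER);
CREATE TABLE TABLE_PARAMS (TBL_ID INTEGER, PARAM_KEY TEXT, PARAM_VALUE TEXT);
INSERT INTO SERDES VALUES (1, 'old.kudu.SerDe'), (2, 'other.SerDe');
INSERT INTO SDS VALUES (10, 1), (20, 2);
INSERT INTO TBLS VALUES (100, 10), (200, 20), (300, NULL);
INSERT INTO TABLE_PARAMS VALUES
  (100, 'storage_handler', 'x.KuduStorageHandler'),
  (200, 'comment', 'plain table'),
  (300, 'storage_handler', 'x.KuduStorageHandler');
"""


def serdes_after_update(join_kind):
    """Run the UPDATE with the given join kind and return the SERDES rows."""
    if join_kind not in ("INNER", "LEFT"):
        raise ValueError(join_kind)
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(SCHEMA_AND_DATA)
        # Same shape as the upgrade-script query; SET uses the bare column
        # name because SQLite does not accept a table-qualified SET target.
        conn.execute(f"""
            UPDATE SERDES
            SET SLIB = 'org.apache.hadoop.hive.kudu.KuduSerDe'
            WHERE SERDE_ID IN (
                SELECT SDS.SERDE_ID
                FROM TBLS
                {join_kind} JOIN SDS ON TBLS.SD_ID = SDS.SD_ID
                WHERE TBLS.TBL_ID IN (
                    SELECT TBL_ID FROM TABLE_PARAMS
                    WHERE PARAM_VALUE LIKE '%KuduStorageHandler%'))
        """)
        return conn.execute(
            "SELECT SERDE_ID, SLIB FROM SERDES ORDER BY SERDE_ID").fetchall()
    finally:
        conn.close()


inner_rows = serdes_after_update("INNER")
left_rows = serdes_after_update("LEFT")
print(inner_rows == left_rows)
```

A NULL SERDE_ID produced by the LEFT JOIN never equals any SERDES.SERDE_ID, so the extra row does not change which rows the IN predicate selects.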




[jira] [Updated] (HIVE-26220) Shade & relocate dependencies in hive-exec to avoid conflicting with downstream projects

2024-05-19 Thread Zhihua Deng (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-26220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng updated HIVE-26220:
---
Labels: pull-request-available  (was: hive-4.0.1-must 
pull-request-available)

> Shade & relocate dependencies in hive-exec to avoid conflicting with 
> downstream projects
> 
>
> Key: HIVE-26220
> URL: https://issues.apache.org/jira/browse/HIVE-26220
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 4.0.0, 4.0.0-alpha-1
>Reporter: Chao Sun
>Priority: Blocker
>  Labels: pull-request-available
>
> Currently, projects like Spark, Trino/Presto, Iceberg, etc. depend on 
> {{hive-exec:core}}, which was removed in HIVE-25531. These projects use 
> {{hive-exec:core}} because it gives them the flexibility to exclude, shade, 
> and relocate dependencies in {{hive-exec}} that conflict with the ones they 
> bring in themselves. However, with {{hive-exec}} this is no longer 
> possible, since it is a fat jar that shades those dependencies but does not 
> relocate many of them.
> For downstream projects to consume {{hive-exec}}, we need to make sure 
> all the dependencies in {{hive-exec}} are properly shaded and 
> relocated, so they won't conflict with those from downstream.




[jira] [Comment Edited] (HIVE-28261) Update Hive version in Docker README

2024-05-15 Thread Zhihua Deng (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-28261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846550#comment-17846550
 ] 

Zhihua Deng edited comment on HIVE-28261 at 5/15/24 8:49 AM:
-

Thank you for the PR, Mert! Feel free to assign the issue to yourself if your 
account is ready.


was (Author: dengzh):
Thank you for the PR Mert!. Feel free to change the assignee to you if the 
account is ready.

> Update Hive version in Docker README
> 
>
> Key: HIVE-28261
> URL: https://issues.apache.org/jira/browse/HIVE-28261
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zhihua Deng
>Priority: Major
> Fix For: 4.1.0
>
>
> Add the updates on the page to the readme as quickstart shows.




[jira] [Resolved] (HIVE-28261) Update Hive version in Docker README

2024-05-15 Thread Zhihua Deng (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-28261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng resolved HIVE-28261.

Fix Version/s: 4.1.0
   Resolution: Fixed

Thank you for the PR, Mert! Feel free to assign the issue to yourself if your 
account is ready.

> Update Hive version in Docker README
> 
>
> Key: HIVE-28261
> URL: https://issues.apache.org/jira/browse/HIVE-28261
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zhihua Deng
>Priority: Major
> Fix For: 4.1.0
>
>
> Add the updates on the page to the readme as quickstart shows.




[jira] [Updated] (HIVE-28261) Update Hive version in Docker README

2024-05-15 Thread Zhihua Deng (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-28261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng updated HIVE-28261:
---
Description: Add the updates on the page to the readme as quickstart shows.

> Update Hive version in Docker README
> 
>
> Key: HIVE-28261
> URL: https://issues.apache.org/jira/browse/HIVE-28261
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zhihua Deng
>Priority: Major
>
> Add the updates on the page to the readme as quickstart shows.




[jira] [Created] (HIVE-28261) Update Hive version in Docker README

2024-05-15 Thread Zhihua Deng (Jira)

Zhihua Deng created HIVE-28261:
--

 Summary: Update Hive version in Docker README
 Key: HIVE-28261
 URL: https://issues.apache.org/jira/browse/HIVE-28261
 Project: Hive
  Issue Type: Improvement
Reporter: Zhihua Deng







[jira] [Updated] (HIVE-28098) Fails to copy empty column statistics of materialized CTE

2024-05-15 Thread Zhihua Deng (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-28098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng updated HIVE-28098:
---
Labels: hive-4.0.1-must pull-request-available  (was: 
pull-request-available)

> Fails to copy empty column statistics of materialized CTE
> -
>
> Key: HIVE-28098
> URL: https://issues.apache.org/jira/browse/HIVE-28098
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Shohei Okumiya
>Assignee: Shohei Okumiya
>Priority: Major
>  Labels: hive-4.0.1-must, pull-request-available
> Fix For: 4.1.0
>
>
> HIVE-28080 introduced the optimization of materialized CTEs, but it turned 
> out that it failed when statistics were empty.
> This query reproduces the issue.
> {code:java}
> set hive.stats.autogather=false;
> CREATE TABLE src_no_stats AS SELECT '123' as key, 'val123' as value UNION ALL 
> SELECT '9' as key, 'val9' as value;
> set hive.optimize.cte.materialize.threshold=2;
> set hive.optimize.cte.materialize.full.aggregate.only=false;
> EXPLAIN WITH materialized_cte1 AS (
>   SELECT * FROM src_no_stats
> ),
> materialized_cte2 AS (
>   SELECT a.key
>   FROM materialized_cte1 a
>   JOIN materialized_cte1 b ON (a.key = b.key)
> )
> SELECT a.key
> FROM materialized_cte2 a
> JOIN materialized_cte2 b ON (a.key = b.key); {code}
> It throws an error.
> {code:java}
> Error: Error while compiling statement: FAILED: IllegalStateException The 
> size of col stats must be equal to that of schema. Stats = [], Schema = [key] 
> (state=42000,code=4) {code}
> With a debugger attached, the FSO of materialized_cte2 turns out to have empty 
> stats because JoinOperator loses stats.




[jira] [Updated] (HIVE-27847) Prevent query Failures on Numeric <-> Timestamp

2024-05-15 Thread Zhihua Deng (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-27847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng updated HIVE-27847:
---
Labels: hive-4.0.1-must pull-request-available  (was: 
pull-request-available)

>  Prevent query Failures on Numeric <-> Timestamp
> 
>
> Key: HIVE-27847
> URL: https://issues.apache.org/jira/browse/HIVE-27847
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 4.0.0-alpha-1, 4.0.0-alpha-2
> Environment: master
> 4.0.0-alpha-1
>Reporter: Basapuram Kumar
>Assignee: Basapuram Kumar
>Priority: Major
>  Labels: hive-4.0.1-must, pull-request-available
> Attachments: HIVE-27847.patch
>
>
> On the master and 4.0.0-alpha-1 branches, a Numeric to Timestamp 
> conversion fails with the error 
> "{color:#de350b}org.apache.hadoop.hive.ql.exec.UDFArgumentException: Casting 
> NUMERIC types to TIMESTAMP is prohibited 
> (hive.strict.timestamp.conversion){color}".
>  
> *Repro steps.*
>  # Sample data
> {noformat}
> $ hdfs dfs -cat /tmp/tc/t.csv
> 1653209895687,2022-05-22T15:58:15.931+07:00
> 1653209938316,2022-05-22T15:58:58.490+07:00
> 1653209962021,2022-05-22T15:59:22.191+07:00
> 1653210021993,2022-05-22T16:00:22.174+07:00
> 1653209890524,2022-05-22T15:58:10.724+07:00
> 1653210095382,2022-05-22T16:01:35.775+07:00
> 1653210044308,2022-05-22T16:00:44.683+07:00
> 1653210098546,2022-05-22T16:01:38.886+07:00
> 1653210012220,2022-05-22T16:00:12.394+07:00
> 165321376,2022-05-22T16:00:00.622+07:00{noformat}
>  # table with above data [1]
> {noformat}
> create external table   test_ts_conv(begin string, ts string) row format 
> delimited fields terminated by ',' stored as TEXTFILE LOCATION '/tmp/tc/';
> desc   test_ts_conv;
> | col_name  | data_type  | comment  |
> +-----------+------------+----------+
> | begin     | string     |          |
> | ts        | string     |          |
> +-----------+------------+----------+{noformat}
>  #  Create table with CTAS
> {noformat}
> 0: jdbc:hive2://char1000.sre.iti.acceldata.de> set 
> hive.strict.timestamp.conversion;
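> +-----------------------------------------+
> |                   set                   |
> +-----------------------------------------+
> | hive.strict.timestamp.conversion=true   |
> +-----------------------------------------+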
> set to false
> 0: jdbc:hive2://char1000.sre.iti.acceldata.de> set 
> hive.strict.timestamp.conversion=false;
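> +-----------------------------------------+
> |                   set                   |
> +-----------------------------------------+
> | hive.strict.timestamp.conversion=false  |
> +-----------------------------------------+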
> #Query:
> 0: jdbc:hive2://char1000.sre.iti.acceldata.de> 
> CREATE TABLE t_date 
> AS 
> select
>   CAST( CAST( `begin` AS BIGINT) / 1000  AS TIMESTAMP ) `begin`, 
>   CAST( 
> DATE_FORMAT(CAST(regexp_replace(`ts`,'(\\d{4})-(\\d{2})-(\\d{2})T(\\d{2}):(\\d{2}):(\\d{2}).(\\d{3})\\+(\\d{2}):(\\d{2})','$1-$2-$3
>  $4:$5:$6.$7') AS TIMESTAMP ),'MMdd') as BIGINT ) `par_key`
> FROM    test_ts_conv;{noformat}
> Error:
> {code:java}
> Caused by: org.apache.hadoop.hive.ql.exec.UDFArgumentException: Casting 
> NUMERIC types to TIMESTAMP is prohibited (hive.strict.timestamp.conversion)
>     at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDFTimestamp.initialize(GenericUDFTimestamp.java:91)
>     at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDF.initializeAndFoldConstants(GenericUDF.java:149)
>     at 
> org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator.initialize(ExprNodeGenericFuncEvaluator.java:184)
>     at 
> org.apache.hadoop.hive.ql.exec.Operator.initEvaluators(Operator.java:1073)
>     at 
> org.apache.hadoop.hive.ql.exec.Operator.initEvaluatorsAndReturnStruct(Operator.java:1099)
>     at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.initializeOp(SelectOperator.java:74)
>     at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:360)
>     at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:549)
>     at 
> org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:503)
>     at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:369)
>     at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:549)
>     at 
> org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:503)
>     at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:369)
>     at 
> org.apache.hadoop.hive.ql.exec.MapOperator.initializeMapOperator(MapOperator.java:508)
>     at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:314)
>     ... 17 more {code}
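
The repro query divides an epoch value in milliseconds by 1000 before casting it to TIMESTAMP. As a rough illustration of that arithmetic outside Hive, the sketch below converts the first sample value from t.csv. It does not model Hive's cast semantics or hive.strict.timestamp.conversion; the function name millis_to_timestamp is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def millis_to_timestamp(ms, tz=timezone.utc):
    """Convert epoch milliseconds to an aware datetime without float rounding."""
    seconds, millis = divmod(ms, 1000)
    return datetime.fromtimestamp(seconds, tz=tz) + timedelta(milliseconds=millis)


# First `begin` value from the sample data in the issue description.
utc_ts = millis_to_timestamp(1653209895687)
# The same instant rendered in the +07:00 offset used by the `ts` column.
local_ts = millis_to_timestamp(1653209895687, tz=timezone(timedelta(hours=7)))

print(utc_ts.isoformat())    # 2022-05-22T08:58:15.687000+00:00
print(local_ts.isoformat())  # 2022-05-22T15:58:15.687000+07:00
```

The +07:00 rendering falls within the same second as the `ts` value on that sample row (15:58:15), which suggests the `begin` column holds epoch milliseconds.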




[jira] [Resolved] (HIVE-28200) Improve get_partitions_by_filter/expr when partition limit enabled

2024-05-07 Thread Zhihua Deng (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-28200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng resolved HIVE-28200.

Fix Version/s: 4.1.0
   Resolution: Fixed

Fix has been pushed to master. Thank you for the PR, [~wechar]!

> Improve get_partitions_by_filter/expr when partition limit enabled
> --
>
> Key: HIVE-28200
> URL: https://issues.apache.org/jira/browse/HIVE-28200
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Wechar
>Assignee: Wechar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.1.0
>
>
> When {{hive.metastore.limit.partition.request}} is configured, HMS gets 
> the count of matching partitions before getting the actual partition objects. 
> The count can be a slow query if the input filter or expr is too complex.
> In that case, the slow filter is executed twice, once when counting the 
> partitions and once when fetching the partition objects, which hurts 
> performance and loads the backend DBMS.
> We can improve this by first getting the matching partition names, 
> then checking the limit against the number of partition names, and finally 
> getting the partitions by those names.
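
The proposed flow can be sketched roughly as follows. This is an in-memory illustration, not the metastore implementation: the dict stands in for the backend database, and the names get_partitions_by_filter and PartitionLimitExceeded are hypothetical. Treating a negative limit as "no limit" is an assumption of this sketch.

```python
from typing import Callable, Dict, List


class PartitionLimitExceeded(Exception):
    """Raised when more partitions match than the configured limit allows."""


def get_partitions_by_filter(
    partitions: Dict[str, dict],
    matches: Callable[[dict], bool],
    limit: int,
) -> List[dict]:
    # Step 1: run the (possibly slow) filter once and keep only the names.
    names = [name for name, part in partitions.items() if matches(part)]
    # Step 2: check the limit against the number of names
    # (assumption: a negative limit disables the check).
    if limit >= 0 and len(names) > limit:
        raise PartitionLimitExceeded(
            f"{len(names)} partitions requested, limit is {limit}")
    # Step 3: fetch the partition objects by name; the filter is not re-run.
    return [partitions[name] for name in names]


filter_calls = []


def slow_filter(part: dict) -> bool:
    filter_calls.append(part)  # record each evaluation of the filter
    return part["ds"] >= 2


store = {"ds=1": {"ds": 1}, "ds=2": {"ds": 2}, "ds=3": {"ds": 3}}
result = get_partitions_by_filter(store, slow_filter, limit=5)
print(result, len(filter_calls))
```

The filter is evaluated once per partition rather than once for the count and again for the fetch, which is the saving the description is after.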




[jira] [Commented] (HIVE-28161) Incorrect Copyright years in META-INF/NOTICE files

2024-05-01 Thread Zhihua Deng (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-28161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17842613#comment-17842613
 ] 

Zhihua Deng commented on HIVE-28161:


Fix has been merged. Thank you for the PR [~zabetak]!

> Incorrect Copyright years in META-INF/NOTICE files
> --
>
> Key: HIVE-28161
> URL: https://issues.apache.org/jira/browse/HIVE-28161
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0-beta-1
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: hive-4.0.1-must, pull-request-available
> Fix For: 4.1.0
>
>
> The generated META-INF/NOTICE file which resides inside each jar produced by 
> Hive has incorrect copyright years.
> Inside all jars the NOTICE file has the following incorrect content:
> {noformat}
> Copyright 2020 The Apache Software Foundation
> {noformat}
> The Copyright statement should include the timespan from the inception of the 
> project to now.
> {noformat}
> Copyright 2008-2024 The Apache Software Foundation
> {noformat}
> The problem can be easily seen by inspecting the jar content after building 
> the module or checking the previously published jars in Maven central.
> {noformat}
> mvn clean install -DskipTests -pl common/
> jar xf common/target/hive-common-4.1.0-SNAPSHOT.jar META-INF
> cat META-INF/NOTICE 
> Hive Common
> Copyright 2020 The Apache Software Foundation
> This product includes software developed at
> The Apache Software Foundation (http://www.apache.org/).
> {noformat}
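
The manual inspection above could also be scripted. The sketch below reads META-INF/NOTICE from a jar with Python's zipfile module and checks whether the copyright line carries a year range. The helper names are hypothetical, and the jar here is built in memory for the demo rather than taken from a real Hive build.

```python
import io
import re
import zipfile

# Matches lines such as "Copyright 2008-2024 The Apache Software Foundation".
YEAR_RANGE = re.compile(r"^Copyright (\d{4})-(\d{4}) ")


def notice_copyright_lines(jar):
    """Return the Copyright lines of META-INF/NOTICE in a jar (path or file object)."""
    with zipfile.ZipFile(jar) as archive:
        text = archive.read("META-INF/NOTICE").decode("utf-8")
    return [line for line in text.splitlines() if line.startswith("Copyright")]


def has_year_range(line):
    """True if the line states a span of years rather than a single year."""
    return YEAR_RANGE.match(line) is not None


# Demo jar built in memory with the NOTICE content quoted in the issue.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as demo_jar:
    demo_jar.writestr(
        "META-INF/NOTICE",
        "Hive Common\nCopyright 2020 The Apache Software Foundation\n")
buf.seek(0)

lines = notice_copyright_lines(buf)
print(lines, [has_year_range(line) for line in lines])
```

Pointing notice_copyright_lines at a built jar path, such as the hive-common jar from the commands above, would apply the same check to real artifacts.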




[jira] [Resolved] (HIVE-28161) Incorrect Copyright years in META-INF/NOTICE files

2024-05-01 Thread Zhihua Deng (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-28161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng resolved HIVE-28161.

Fix Version/s: 4.1.0
   Resolution: Fixed

> Incorrect Copyright years in META-INF/NOTICE files
> --
>
> Key: HIVE-28161
> URL: https://issues.apache.org/jira/browse/HIVE-28161
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0-beta-1
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: hive-4.0.1-must, pull-request-available
> Fix For: 4.1.0
>
>
> The generated META-INF/NOTICE file which resides inside each jar produced by 
> Hive has incorrect copyright years.
> Inside all jars the NOTICE file has the following incorrect content:
> {noformat}
> Copyright 2020 The Apache Software Foundation
> {noformat}
> The Copyright statement should include the timespan from the inception of the 
> project to now.
> {noformat}
> Copyright 2008-2024 The Apache Software Foundation
> {noformat}
> The problem can be easily seen by inspecting the jar content after building 
> the module or checking the previously published jars in Maven central.
> {noformat}
> mvn clean install -DskipTests -pl common/
> jar xf common/target/hive-common-4.1.0-SNAPSHOT.jar META-INF
> cat META-INF/NOTICE 
> Hive Common
> Copyright 2020 The Apache Software Foundation
> This product includes software developed at
> The Apache Software Foundation (http://www.apache.org/).
> {noformat}




[jira] [Updated] (HIVE-28206) Preparing for 4.0.1 development

2024-04-26 Thread Zhihua Deng (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-28206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng updated HIVE-28206:
---
Labels:   (was: 4.)

> Preparing for 4.0.1 development
> ---
>
> Key: HIVE-28206
> URL: https://issues.apache.org/jira/browse/HIVE-28206
> Project: Hive
>  Issue Type: Task
>Reporter: Denys Kuzmenko
>Assignee: Zhihua Deng
>Priority: Major
>





[jira] [Updated] (HIVE-28206) Preparing for 4.0.1 development

2024-04-26 Thread Zhihua Deng (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-28206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng updated HIVE-28206:
---
Labels: 4.  (was: )

> Preparing for 4.0.1 development
> ---
>
> Key: HIVE-28206
> URL: https://issues.apache.org/jira/browse/HIVE-28206
> Project: Hive
>  Issue Type: Task
>Reporter: Denys Kuzmenko
>Assignee: Zhihua Deng
>Priority: Major
>  Labels: 4.
>





[jira] [Assigned] (HIVE-28206) Preparing for 4.0.1 development

2024-04-26 Thread Zhihua Deng (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-28206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng reassigned HIVE-28206:
--

Assignee: Zhihua Deng

> Preparing for 4.0.1 development
> ---
>
> Key: HIVE-28206
> URL: https://issues.apache.org/jira/browse/HIVE-28206
> Project: Hive
>  Issue Type: Task
>Reporter: Denys Kuzmenko
>Assignee: Zhihua Deng
>Priority: Major
>





[jira] [Updated] (HIVE-28158) Add ASF license header in non-java files

2024-04-26 Thread Zhihua Deng (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-28158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng updated HIVE-28158:
---
Labels: hive-4.0.1-must  (was: )

> Add ASF license header in non-java files
> 
>
> Key: HIVE-28158
> URL: https://issues.apache.org/jira/browse/HIVE-28158
> Project: Hive
>  Issue Type: Task
>  Components: Documentation
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: hive-4.0.1-must
> Fix For: 4.1.0
>
>
> According to the [ASF policy|https://www.apache.org/legal/src-headers.html] 
> all source files should contain an ASF header. Currently there are a lot of 
> source files that do not contain the ASF header. The files can be broken into 
> the following categories:
> *Must have:*
>  * Python files (.py)
>  * Bash/Shell script files (.sh)
>  * Javascript files (.js)
> *Should have:*
>  * Maven files (pom.xml)
>  * GitHub workflows and Docker files (.yml)
> *Good to have:*
>  * Hive/Tez/Yarn and other configuration files (.xml)
>  * Log4J property files (.properties)
>  * Markdown files (.md)
> *Could have but OK if they don't:*
>  * Data files for tests (data/files/**)
>  * Generated code files (src/gen)
>  * QTest input/output files (.q, .q.out)
>  * IntelliJ files (.idea)
>  * Other txt and data files
> The changes here aim to address the first three categories (must, should, 
> good) and add the missing header when possible.
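The categories listed above can be read as a simple classification by path and extension. The following Python sketch is one possible reading of that list; the function name `header_priority` and the example paths are made up for illustration and do not come from the Hive repository.

```python
from pathlib import PurePosixPath

MUST = {".py", ".sh", ".js"}
GOOD = {".xml", ".properties", ".md"}

def header_priority(path):
    """Classify a file into must/should/good/could (illustrative only)."""
    p = PurePosixPath(path)
    parts = p.parts
    in_src_gen = any(a == "src" and b == "gen" for a, b in zip(parts, parts[1:]))
    # Test data, generated code and IDE files fall into the lowest category,
    # whatever their extension is.
    if path.startswith("data/files/") or in_src_gen or ".idea" in parts:
        return "could"
    if p.name == "pom.xml" or p.suffix == ".yml":
        return "should"
    if p.suffix in MUST:
        return "must"
    if p.suffix in GOOD:
        return "good"
    # Everything else (.q, .q.out, txt and data files) is optional.
    return "could"

for example in ["scripts/run.sh", "pom.xml", "conf/site.xml", "data/files/t.xml"]:
    print(example, header_priority(example))
```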




[jira] [Updated] (HIVE-28211) Restore hive-exec-core jar

2024-04-26 Thread Zhihua Deng (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-28211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng updated HIVE-28211:
---
Labels: hive-4.0.1-must pull-request-available  (was: 
pull-request-available)

> Restore hive-exec-core jar
> --
>
> Key: HIVE-28211
> URL: https://issues.apache.org/jira/browse/HIVE-28211
> Project: Hive
>  Issue Type: Task
>Reporter: Simhadri Govindappa
>Assignee: Simhadri Govindappa
>Priority: Major
>  Labels: hive-4.0.1-must, pull-request-available
>
> The hive-exec-core jar is used by Spark, Oozie, Hudi and many other projects. 
> Removal of the hive-exec-core jar has caused the following issues:
> Spark : [https://lists.apache.org/list?d...@hive.apache.org:lte=1M:joda]
> Oozie: [https://lists.apache.org/thread/yld75ltf9y8d9q3cow3xqlg0fqyj6mkg]
> Hudi: [apache/hudi#8147|https://github.com/apache/hudi/issues/8147]
> Until we shade and relocate dependencies in hive-exec, we should restore 
> the hive-exec-core jar.




[jira] [Updated] (HIVE-28161) Incorrect Copyright years in META-INF/NOTICE files

2024-04-26 Thread Zhihua Deng (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-28161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng updated HIVE-28161:
---
Labels: hive-4.0.1-must pull-request-available  (was: 
pull-request-available)

> Incorrect Copyright years in META-INF/NOTICE files
> --
>
> Key: HIVE-28161
> URL: https://issues.apache.org/jira/browse/HIVE-28161
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0-beta-1
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: hive-4.0.1-must, pull-request-available
>
> The generated META-INF/NOTICE file which resides inside each jar produced by 
> Hive has incorrect copyright years.
> Inside all jars the NOTICE file has the following incorrect content:
> {noformat}
> Copyright 2020 The Apache Software Foundation
> {noformat}
> The Copyright statement should include the timespan from the inception of the 
> project to now.
> {noformat}
> Copyright 2008-2024 The Apache Software Foundation
> {noformat}
> The problem can be easily seen by inspecting the jar content after building 
> the module or checking the previously published jars in Maven central.
> {noformat}
> mvn clean install -DskipTests -pl common/
> jar xf common/target/hive-common-4.1.0-SNAPSHOT.jar META-INF
> cat META-INF/NOTICE 
> Hive Common
> Copyright 2020 The Apache Software Foundation
> This product includes software developed at
> The Apache Software Foundation (http://www.apache.org/).
> {noformat}




[jira] [Updated] (HIVE-27494) Deduplicate the task result that generated by more branches in union all

2024-04-22 Thread Zhihua Deng (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-27494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng updated HIVE-27494:
---
Description: 
HIVE-23891 adds the ability to deduplicate the task result that under the 
directory,

//_tmp.-ext-1//HIVE_UNION_SUBDIR_1,

but turns out to ignore taking the same action to the output directory for the 
same query:

//_tmp.-ext-1//HIVE_UNION_SUBDIR_2.

So user may still have the same data duplication problem upon multiple tez task 
attempts.

  was:
HIVE-23891 adds the ability to deduplicate the task result that under the 
directory,

//_tmp.-ext-1//HIVE_UNION_SUBDIR_1,

but turns out to ignore taking the same action to the directory for the same 
query:

//_tmp.-ext-1//HIVE_UNION_SUBDIR_2.

So user may still have the same data duplication problem in multiple tez task 
attempts.


> Deduplicate the task result that generated by more branches in union all
> 
>
> Key: HIVE-27494
> URL: https://issues.apache.org/jira/browse/HIVE-27494
> Project: Hive
>  Issue Type: Bug
>Reporter: Zhihua Deng
>Assignee: Zhihua Deng
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0-beta-1
>
> Attachments: ddl.q, explain.output
>
>
> HIVE-23891 adds the ability to deduplicate the task results under the 
> directory,
> //_tmp.-ext-1//HIVE_UNION_SUBDIR_1,
> but it turns out that the same action is not taken on the output directory for 
> the same query:
> //_tmp.-ext-1//HIVE_UNION_SUBDIR_2.
> So users may still hit the same data duplication problem after multiple Tez 
> task attempts.
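To make the duplication problem concrete, here is a minimal Python sketch of deduplicating task attempts in every union subdirectory rather than only the first one. It assumes output files are named `<taskId>_<attemptId>` (for example `000000_1`) and keeps the highest attempt purely for illustration; the function name `dedup_attempts` is hypothetical and this is not the logic of the actual patch.

```python
import re

# Assumed naming scheme: <taskId>_<attemptId>, e.g. 000000_1
ATTEMPT_RE = re.compile(r"^(\d+)_(\d+)$")

def dedup_attempts(files_by_subdir):
    """Keep one file per task id in every subdirectory (illustrative only)."""
    kept = {}
    for subdir, names in files_by_subdir.items():
        best = {}
        for name in names:
            m = ATTEMPT_RE.match(name)
            if m is None:
                continue  # ignore files that do not follow the assumed naming
            task, attempt = m.group(1), int(m.group(2))
            if task not in best or attempt > best[task][0]:
                best[task] = (attempt, name)
        kept[subdir] = sorted(name for _, name in best.values())
    return kept

listing = {
    "HIVE_UNION_SUBDIR_1": ["000000_0", "000000_1", "000001_0"],
    "HIVE_UNION_SUBDIR_2": ["000000_0", "000000_1"],
}
result = dedup_attempts(listing)
print(result)
```

The point of the sketch is that the loop visits every subdirectory; handling only `HIVE_UNION_SUBDIR_1` would leave both attempts of task `000000` in `HIVE_UNION_SUBDIR_2`.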




[jira] [Updated] (HIVE-28204) Remove some HMS obsolete scripts

2024-04-17 Thread Zhihua Deng (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-28204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng updated HIVE-28204:
---
Description: As the Hive 1.x has reached end of life, the scripts for HMS 
metadata need to be removed from the repository and the packaged tarball, 
however it's better to keep the script for the Hive to upgrade from 1.x and the 
test.  (was: As the Hive 1.x has reached end of life, the scripts for HMS 
metadata need to be removed from the repository and the packaged tarball, 
however it's better to keep the script for the Hive to upgrade from 1.x.)

> Remove some HMS obsolete scripts
> 
>
> Key: HIVE-28204
> URL: https://issues.apache.org/jira/browse/HIVE-28204
> Project: Hive
>  Issue Type: Task
>Reporter: Zhihua Deng
>Assignee: Zhihua Deng
>Priority: Major
>  Labels: hive-4.0.1-must, pull-request-available
>
> As Hive 1.x has reached end of life, its HMS metadata scripts need to 
> be removed from the repository and the packaged tarball; however, it's better 
> to keep the scripts used to upgrade Hive from 1.x and those used by the tests.




[jira] [Updated] (HIVE-28204) Remove some HMS obsolete scripts

2024-04-17 Thread Zhihua Deng (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-28204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng updated HIVE-28204:
---
Summary: Remove some HMS obsolete scripts  (was: Remove the HMS 1.x init 
script)

> Remove some HMS obsolete scripts
> 
>
> Key: HIVE-28204
> URL: https://issues.apache.org/jira/browse/HIVE-28204
> Project: Hive
>  Issue Type: Task
>Reporter: Zhihua Deng
>Assignee: Zhihua Deng
>Priority: Major
>  Labels: hive-4.0.1-must, pull-request-available
>
> As Hive 1.x has reached end of life, its HMS metadata scripts need to 
> be removed from the repository and the packaged tarball; however, it's better 
> to keep the script used to upgrade Hive from 1.x.




[jira] [Created] (HIVE-28204) Remove the HMS 1.x init script

2024-04-17 Thread Zhihua Deng (Jira)

Zhihua Deng created HIVE-28204:
--

 Summary: Remove the HMS 1.x init script
 Key: HIVE-28204
 URL: https://issues.apache.org/jira/browse/HIVE-28204
 Project: Hive
  Issue Type: Task
Reporter: Zhihua Deng
Assignee: Zhihua Deng


As Hive 1.x has reached end of life, its HMS metadata scripts need to 
be removed from the repository and the packaged tarball; however, it's better to 
keep the script used to upgrade Hive from 1.x.




[jira] [Commented] (HIVE-28182) Logs page doesn't load in Hive UI

2024-04-16 Thread Zhihua Deng (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-28182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17837932#comment-17837932
 ] 

Zhihua Deng commented on HIVE-28182:


It works for me as well...

> Logs page doesn't load in Hive UI
> -
>
> Key: HIVE-28182
> URL: https://issues.apache.org/jira/browse/HIVE-28182
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Dmitriy Fingerman
>Priority: Major
> Attachments: Screen Shot 2024-04-05 at 5.16.26 PM.png, 
> image-2024-04-06-02-51-24-363.png
>
>





[jira] [Commented] (HIVE-28198) Trino table is recognized as EXTERNAL_TABLE regardless of external_location parameter

2024-04-16 Thread Zhihua Deng (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-28198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17837931#comment-17837931
 ] 

Zhihua Deng commented on HIVE-28198:


In Hive 4.0, Hive ACID tables are treated as managed tables by default. A 
legacy managed table is translated to an external table with the parameters 
TRANSLATED_TO_EXTERNAL=true and external.table.purge=true. Legacy external 
tables work the same as before.
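The rule described in this comment can be modelled as a small function. The Python below is only a sketch of that description, not HMS code; the function name `translate_table` and its signature are assumptions, while the two parameter names and values are the ones quoted above.

```python
def translate_table(table_type, is_acid, params=None):
    """Model of the Hive 4 table translation described above (illustrative only)."""
    params = dict(params or {})
    if table_type == "MANAGED_TABLE" and not is_acid:
        # A legacy (non-ACID) managed table becomes an external table
        # that is still purged when it is dropped.
        params["TRANSLATED_TO_EXTERNAL"] = "true"
        params["external.table.purge"] = "true"
        return "EXTERNAL_TABLE", params
    # ACID managed tables and legacy external tables are left unchanged.
    return table_type, params

print(translate_table("MANAGED_TABLE", is_acid=False))
print(translate_table("MANAGED_TABLE", is_acid=True))
print(translate_table("EXTERNAL_TABLE", is_acid=False))
```

Under this reading, a table created without transactional properties ends up stored as EXTERNAL_TABLE, which is consistent with the TBL_TYPE value reported in the issue.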

> Trino table is recognized as EXTERNAL_TABLE regardless of external_location 
> parameter
> -
>
> Key: HIVE-28198
> URL: https://issues.apache.org/jira/browse/HIVE-28198
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 4.0.0
>Reporter: Mladjan Gadzic
>Priority: Major
>
> {code:java}
> trino > create table hive.default.test_table(id int);{code}
> {code:java}
> trino> delete from hive.default.test_table;
> Query 20240402_103228_00042_hm8m3, FAILED, 1 node Splits: 1 total, 0 done 
> (0.00%) 0.08 [0 rows, 0B] [0 rows/s, 0B/s]
> Query 20240402_103228_00042_hm8m3 failed: Cannot delete from non-managed Hive 
>  table{code}
> This behavior is tested and works as expected in Hive 3. The table type is stored 
> in the HMS DB in the {{TBL_TYPE}} field of the {{TBLS}} table. For Hive 3 the value is 
> MANAGED_TABLE; for Hive 4 it is EXTERNAL_TABLE.
>  




[jira] [Resolved] (HIVE-28199) Docker quickstart does not work for Hive 3.1.3 on Mac M2

2024-04-16 Thread Zhihua Deng (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-28199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng resolved HIVE-28199.

Fix Version/s: 4.1.0
   Resolution: Fixed

Fix has been merged. Thank you for the PR [~ryandgoldenberg]!

> Docker quickstart does not work for Hive 3.1.3 on Mac M2
> 
>
> Key: HIVE-28199
> URL: https://issues.apache.org/jira/browse/HIVE-28199
> Project: Hive
>  Issue Type: Bug
>Reporter: Ryan Goldenberg
>Assignee: Ryan Goldenberg
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.1.0
>
>
> Quickstart: 
> [https://hive.apache.org/developement/quickstart/#--hiveserver2-metastore]
> On Mac M2, {{docker-compose up}} for {{HIVE_VERSION=3.1.3}} gives the 
> following errors
>  * {{/home/hive/.beeline}} directory issue
> {quote}metastore | *** schemaTool failed ***
> metastore | [
> metastore | WARN] Failed to create directory:
> metastore | /home/hive/.beeline
> metastore | No such file or directory
> {quote}
>  * Underscore in network name, from {{/tmp/hive/hive.log}} on 
> {{hiveserver2}}:
> {quote}2024-04-02T16:26:24,867 ERROR [main] utils.MetaStoreUtils: Got 
> exception: java.net.URISyntaxException Illegal character in hostname at index 
> 25: thrift://metastore.docker_default:9083
> java.net.URISyntaxException: Illegal character in hostname at index 25: 
> thrift://metastore.docker_default:9083
> {quote}
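The second error comes from the underscore in the hostname: hostname labels are conventionally limited to letters, digits and hyphens, separated by dots. The Python sketch below locates the first offending character in a URI; the helper name `first_illegal_host_index` is made up for illustration, and the parsing is deliberately simplified (no IPv6 literals or user-info handling).

```python
import string

# Characters accepted in a hostname in this simplified model.
ALLOWED = set(string.ascii_letters + string.digits + "-.")

def first_illegal_host_index(uri):
    """Return the index in `uri` of the first illegal hostname character, or -1."""
    host_start = uri.index("://") + 3
    rest = uri[host_start:]
    # Drop any path, then any ":port" suffix.
    host = rest.split("/", 1)[0].rsplit(":", 1)[0]
    for offset, ch in enumerate(host):
        if ch not in ALLOWED:
            return host_start + offset
    return -1

print(first_illegal_host_index("thrift://metastore.docker_default:9083"))  # prints 25
```

The reported position matches the "index 25" in the exception above, which points at the underscore in `docker_default`.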




[jira] [Resolved] (HIVE-27746) Hive Metastore should send single AlterPartitionEvent with list of partitions

2024-03-11 Thread Zhihua Deng (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-27746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng resolved HIVE-27746.

Fix Version/s: 4.1.0
   Resolution: Fixed

Merged to master. Thank you [~hemanth619], [~jfs] and [~henrib] for the review!

A property, metastore.alterPartitions.notification.v2.enabled, has been introduced 
for backward compatibility: when it is set to false, downstream notification 
consumers can still process the ALTER_PARTITION event without changes.
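The effect of the flag can be pictured with a short Python sketch. It is a model of the behaviour described in this comment, not metastore code: the function name, the dictionary layout and the `BATCHED_ALTER` label for the new single event are all placeholders chosen for illustration.

```python
def build_alter_partition_events(table, partitions, v2_enabled):
    """Return the notification events for one bulk alter call (illustrative only)."""
    if v2_enabled:
        # New behaviour: one event carrying the whole list of partitions.
        # "BATCHED_ALTER" is a made-up label, not the real event type name.
        return [{"type": "BATCHED_ALTER", "table": table, "partitions": list(partitions)}]
    # Legacy behaviour: one ALTER_PARTITION event per partition, so existing
    # consumers keep working unchanged.
    return [
        {"type": "ALTER_PARTITION", "table": table, "partitions": [p]}
        for p in partitions
    ]

parts = ["ds=2024-01-01", "ds=2024-01-02", "ds=2024-01-03"]
legacy = build_alter_partition_events("t", parts, v2_enabled=False)
batched = build_alter_partition_events("t", parts, v2_enabled=True)
print(len(legacy), len(batched))
```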

> Hive Metastore should send single AlterPartitionEvent with list of partitions
> -
>
> Key: HIVE-27746
> URL: https://issues.apache.org/jira/browse/HIVE-27746
> Project: Hive
>  Issue Type: Improvement
>  Components: Standalone Metastore
>Reporter: Naveen Gangam
>Assignee: Zhihua Deng
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.1.0
>
>
> In HIVE-3938, work was done to send single AddPartitionEvent for APIs that 
> add partitions in bulk. Similarly, we have alter_partitions APIs that alter 
> partitions in bulk via a single HMS call. For such events, we should also 
> send a single AlterPartitionEvent with a list of partitions in it.
> This would be way more efficient than having to send and process them 
> individually.
> This fix will be incompatible with older clients that expect a single 
> partition.




[jira] [Assigned] (HIVE-27805) Hive server2 connections limits bug

2024-02-28 Thread Zhihua Deng (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-27805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng reassigned HIVE-27805:
--

Assignee: Zhihua Deng

> Hive server2 connections limits bug
> ---
>
> Key: HIVE-27805
> URL: https://issues.apache.org/jira/browse/HIVE-27805
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 3.1.0, 3.0.0, 3.1.1, 3.1.2, 3.1.3
> Environment: image: apache/hive:3.1.3
> jdbc driver: hive-jdbc:2.1.0
>Reporter: Xiwei Wang
>Assignee: Zhihua Deng
>Priority: Major
>
> When I use JDBC and specify a non-existent database to connect to a 
> hiveserver2 that is configured with hive.server2.limit.connections.per.user=10, a 
> session initialization error occurs 
> (org.apache.hive.service.cli.HiveSQLException: Failed to open new 
> session: Database not_exists_db does not exist). Once the number of failed 
> attempts exceeds the limit I configured, even a normal connection reports an 
> error (org.apache.hive.service.cli.HiveSQLException: Connection limit 
> per user reached (user: aeolus limit: 10)).
>  
> I found that inside the method 
> org.apache.hive.service.cli.session.SessionManager#createSession, 
> if session initialization fails, the connection count that was increased by 
> incrementConnections is never released. After the number of failures exceeds 
> the maximum number of connections configured, for example by 
> hive.server2.limit.connections.per.user, hiveserver2 will not accept 
> any further connections because of the limit.
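The leak described above, and one general way to avoid it, can be shown with a small Python model. This is not the HiveServer2 code and not necessarily how the issue was fixed; the class `ConnectionLimiter` and the function `create_session` are hypothetical names that only mirror the description.

```python
class ConnectionLimiter:
    """Per-user connection counter (illustrative model only)."""

    def __init__(self, limit_per_user):
        self.limit = limit_per_user
        self.counts = {}

    def increment(self, user):
        if self.counts.get(user, 0) >= self.limit:
            raise RuntimeError(
                f"Connection limit per user reached (user: {user} limit: {self.limit})"
            )
        self.counts[user] = self.counts.get(user, 0) + 1

    def decrement(self, user):
        self.counts[user] = self.counts.get(user, 0) - 1


def create_session(limiter, user, open_session):
    """Reserve a slot, open the session, and release the slot on failure."""
    limiter.increment(user)
    try:
        return open_session()
    except Exception:
        # Without this release, every failed initialization would consume
        # one slot permanently, which is the behaviour reported above.
        limiter.decrement(user)
        raise


def failing_open():
    raise ValueError("Database not_exists_db does not exist")


limiter = ConnectionLimiter(limit_per_user=2)
for _ in range(5):
    try:
        create_session(limiter, "aeolus", failing_open)
    except ValueError:
        pass
print(limiter.counts["aeolus"])
```

Because the slot is released in the failure path, the counter returns to zero after the five failed attempts and a later valid connection is still accepted.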
>  
> {code:java}
> 2023-10-17T12:14:54,313  WARN [HiveServer2-Handler-Pool: Thread-3329] 
> thrift.ThriftCLIService: Error opening session:
> org.apache.hive.service.cli.HiveSQLException: Failed to open new session: 
> Database not_exists_db does not exist
>     at 
> org.apache.hive.service.cli.session.SessionManager.createSession(SessionManager.java:434)
>  ~[hive-service-3.1.3.jar:3.1.3]
>     at 
> org.apache.hive.service.cli.session.SessionManager.openSession(SessionManager.java:373)
>  ~[hive-service-3.1.3.jar:3.1.3]
>     at 
> org.apache.hive.service.cli.CLIService.openSession(CLIService.java:187) 
> ~[hive-service-3.1.3.jar:3.1.3]
>     at 
> org.apache.hive.service.cli.thrift.ThriftCLIService.getSessionHandle(ThriftCLIService.java:475)
>  ~[hive-service-3.1.3.jar:3.1.3]
>     at 
> org.apache.hive.service.cli.thrift.ThriftCLIService.OpenSession(ThriftCLIService.java:322)
>  ~[hive-service-3.1.3.jar:3.1.3]
>     at 
> org.apache.hive.service.rpc.thrift.TCLIService$Processor$OpenSession.getResult(TCLIService.java:1497)
>  ~[hive-exec-3.1.3.jar:3.1.3]
>     at 
> org.apache.hive.service.rpc.thrift.TCLIService$Processor$OpenSession.getResult(TCLIService.java:1482)
>  ~[hive-exec-3.1.3.jar:3.1.3]
>     at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) 
> ~[hive-exec-3.1.3.jar:3.1.3]
>     at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) 
> ~[hive-exec-3.1.3.jar:3.1.3]
>     at 
> org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56)
>  ~[hive-service-3.1.3.jar:3.1.3]
>     at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
>  ~[hive-exec-3.1.3.jar:3.1.3]
>     at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  ~[?:1.8.0_342]
>     at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  ~[?:1.8.0_342]
>     at java.lang.Thread.run(Thread.java:750) [?:1.8.0_342]
> Caused by: org.apache.hive.service.cli.HiveSQLException: Database dw_aeolus 
> does not exist
>     at 
> org.apache.hive.service.cli.session.HiveSessionImpl.configureSession(HiveSessionImpl.java:294)
>  ~[hive-service-3.1.3.jar:3.1.3]
>     at 
> org.apache.hive.service.cli.session.HiveSessionImpl.open(HiveSessionImpl.java:199)
>  ~[hive-service-3.1.3.jar:3.1.3]
>     at 
> org.apache.hive.service.cli.session.SessionManager.createSession(SessionManager.java:425)
>  ~[hive-service-3.1.3.jar:3.1.3]
>     ... 13 more
> 2023-10-17T12:14:54,972  INFO [HiveServer2-Handler-Pool: Thread-3330] 
> thrift.ThriftCLIService: Client protocol version: HIVE_CLI_SERVICE_PROTOCOL_V8
> 2023-10-17T12:14:54,973 ERROR [HiveServer2-Handler-Pool: Thread-3330] 
> service.CompositeService: Connection limit per user reached (user: aeolus 
> limit: 10)
> 2023-10-17T12:14:54,973  WARN [HiveServer2-Handler-Pool: Thread-3330] 
> thrift.ThriftCLIService: Error opening session:
> org.apache.hive.service.cli.HiveSQLException: Connection limit per user 
> reached (user: aeolus limit: 10)
>     at 
> org.apache.hive.service.cli.session.SessionManager.incrementConnections(SessionManager.java:476)
>  ~[hive-service-3.1.3.jar:3.1.3]
>     at 
>

[jira] [Resolved] (HIVE-27692) Explore removing the always task from embedded HMS

2024-02-23 Thread Zhihua Deng (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-27692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng resolved HIVE-27692.

Fix Version/s: 4.0.0
   Resolution: Fixed

Fix has been merged. Thank you Naveen for the review!

> Explore removing the always task from embedded HMS
> --
>
> Key: HIVE-27692
> URL: https://issues.apache.org/jira/browse/HIVE-27692
> Project: Hive
>  Issue Type: Improvement
>  Components: Standalone Metastore
>Reporter: Zhihua Deng
>Assignee: Zhihua Deng
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> The always tasks now run in the leader HMS, so the properties for 
> configuring the leader should belong only to HMS; other engines such as 
> Spark/Impala don't need to know these properties. In most cases, the 
> engine only cares about the properties for connecting to HMS, e.g. 
> hive.metastore.uris.
> Every time a new app uses an embedded Metastore, it starts the HMS 
> always tasks by default. Imagine we have hundreds of apps: hundreds of 
> copies of the same tasks would then be running, which puts an extra burden on the 
> underlying database, such as a flood of queries and exhausted connection limits.
> I think we can remove the always tasks from the embedded Metastore; the always 
> tasks will be taken care of by the standalone Metastore, since a standalone 
> Metastore should be present in a production environment.




[jira] [Comment Edited] (HIVE-26435) Add method for collecting HMS meta summary

2024-02-21 Thread Zhihua Deng (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-26435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17819470#comment-17819470
 ] 

Zhihua Deng edited comment on HIVE-26435 at 2/22/24 4:55 AM:
-

Fix has been merged into master. Thank you [~danielzhu] and [~ruyi.zheng] for 
the work! 


was (Author: dengzh):
Fix has been merged into master, Thank you [~danielzhu] for the work! 

> Add method for collecting HMS meta summary
> --
>
> Key: HIVE-26435
> URL: https://issues.apache.org/jira/browse/HIVE-26435
> Project: Hive
>  Issue Type: New Feature
>Reporter: Ruyi Zheng
>Assignee: Hongdan Zhu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Hive Metastore currently lacks visibility into its metadata. This work 
> includes enhancing the Hive Metatool to include an option (JSON, CONSOLE) to 
> print a summary (catalog name, database name, table name, partition column 
> count, number of rows, table type, file type, compression type, total data 
> size, etc.). 




[jira] [Resolved] (HIVE-26435) Add method for collecting HMS meta summary

2024-02-21 Thread Zhihua Deng (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-26435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng resolved HIVE-26435.

Fix Version/s: 4.0.0
 Assignee: Hongdan Zhu  (was: Naveen Gangam)
   Resolution: Fixed

Fix has been merged into master. Thank you [~danielzhu] for the work! 

> Add method for collecting HMS meta summary
> --
>
> Key: HIVE-26435
> URL: https://issues.apache.org/jira/browse/HIVE-26435
> Project: Hive
>  Issue Type: New Feature
>Reporter: Ruyi Zheng
>Assignee: Hongdan Zhu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Hive Metastore currently lacks visibility into its metadata. This work 
> includes enhancing the Hive Metatool to include an option (JSON, CONSOLE) to 
> print a summary (catalog name, database name, table name, partition column 
> count, number of rows, table type, file type, compression type, total data 
> size, etc.). 




[jira] [Updated] (HIVE-26435) Add method for collecting HMS meta summary

2024-02-21 Thread Zhihua Deng (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-26435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng updated HIVE-26435:
---
Summary: Add method for collecting HMS meta summary  (was: HMS Summary)

> Add method for collecting HMS meta summary
> --
>
> Key: HIVE-26435
> URL: https://issues.apache.org/jira/browse/HIVE-26435
> Project: Hive
>  Issue Type: New Feature
>Reporter: Ruyi Zheng
>Assignee: Naveen Gangam
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Hive Metastore currently lacks visibility into its metadata. This work 
> includes enhancing the Hive Metatool to include an option (JSON, CONSOLE) to 
> print a summary (catalog name, database name, table name, partition column 
> count, number of rows, table type, file type, compression type, total data 
> size, etc.). 




[jira] [Resolved] (HIVE-27778) Alter table command gives error after computer stats is run with Impala

2024-02-21 Thread Zhihua Deng (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-27778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng resolved HIVE-27778.

Fix Version/s: 4.0.0
   Resolution: Fixed

Fix has been merged into master. Thank you [~dkuzmenko] and [~zhangbutao] for 
the review!

> Alter table command gives error after computer stats is run with Impala
> ---
>
> Key: HIVE-27778
> URL: https://issues.apache.org/jira/browse/HIVE-27778
> Project: Hive
>  Issue Type: Bug
>Reporter: Kokila N
>Assignee: Zhihua Deng
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> A Hive partitioned table's column stats are stored at the partition level, 
> in PART_COL_STATS in sys.
> When a "column stats " query is run in Impala on a Hive partitioned 
> table, it generates column stats at the table level, which are stored in TAB_COL_STATS.
> So executing "Alter rename  to " after Impala compute 
> stats throws an error.
> {code:java}
> ERROR : Failed
> org.apache.hadoop.hive.ql.metadata.HiveException: Unable to alter table. 
> Cannot change stats state for a transactional table default.parqtest without 
> providing the transactional write state for verification (new write ID 6, 
> valid write IDs null; current state null; new state {} {code}
> The column stats generated by Impala need to be deleted for the alter command 
> to work.




[jira] [Assigned] (HIVE-27778) Alter table command gives error after computer stats is run with Impala

2024-01-25 Thread Zhihua Deng (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-27778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng reassigned HIVE-27778:
--

Assignee: Zhihua Deng

> Alter table command gives error after computer stats is run with Impala
> ---
>
> Key: HIVE-27778
> URL: https://issues.apache.org/jira/browse/HIVE-27778
> Project: Hive
>  Issue Type: Bug
>Reporter: Kokila N
>Assignee: Zhihua Deng
>Priority: Major
>
> A Hive partitioned table's column stats are stored at the partition level, 
> in PART_COL_STATS in sys.
> When a "column stats " query is run in Impala on a Hive partitioned 
> table, it generates column stats at the table level, which are stored in TAB_COL_STATS.
> So executing "Alter rename  to " after Impala compute 
> stats throws an error.
> {code:java}
> ERROR : Failed
> org.apache.hadoop.hive.ql.metadata.HiveException: Unable to alter table. 
> Cannot change stats state for a transactional table default.parqtest without 
> providing the transactional write state for verification (new write ID 6, 
> valid write IDs null; current state null; new state {} {code}
> The column stats generated by Impala need to be deleted for the alter command 
> to work.




[jira] [Commented] (HIVE-27775) DirectSQL and JDO results are different when fetching partitions by timestamp in DST shift

2024-01-19 Thread Zhihua Deng (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-27775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17808855#comment-17808855
 ] 

Zhihua Deng commented on HIVE-27775:


Not sure how many users could use timestamp as the type for the partition 
column; it looks like this is a new feature introduced in Hive 4.0.

From the Jira description, the partition predicate in the query

 
{code:java}
SELECT * FROM payments WHERE txn_datetime = '2023-03-26 02:30:00'; {code}
returns an empty result with direct SQL, but a partition ('2023-03-26 02:30:00') 
with JDO. Given that the TIMESTAMP data type in Hive is timezone agnostic, I 
assume the result from JDO is correct.

Besides this difference, when I tested the date/timestamp partition column 
against Postgres and MySQL, there're some problems:

Postgres(direct sql):

operator does not exist: timestamp without time zone = character varying

MySQL(direct sql):

You have an error in your SQL syntax; check the manual that corresponds to your 
MySQL server version for the right syntax to use near 'TIMESTAMP) else null 
end) as TIMESTAMP) = '2023-03-26 03:30:00'))' at line 1 

 

> DirectSQL and JDO results are different when fetching partitions by timestamp 
> in DST shift
> --
>
> Key: HIVE-27775
> URL: https://issues.apache.org/jira/browse/HIVE-27775
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Affects Versions: 4.0.0-beta-1
>Reporter: Stamatis Zampetakis
>Assignee: Zhihua Deng
>Priority: Critical
>  Labels: pull-request-available
>
> DirectSQL and JDO results are different when fetching partitions by timestamp 
> in DST shift.
> {code:sql}
> --! qt:timezone:Europe/Paris
> CREATE EXTERNAL TABLE payments (card string) PARTITIONED BY(txn_datetime 
> TIMESTAMP) STORED AS ORC;
> INSERT into payments VALUES('---', '2023-03-26 02:30:00');
> SELECT * FROM payments WHERE txn_datetime = '2023-03-26 02:30:00';
> {code}
> The '2023-03-26 02:30:00' is a timestamp that in Europe/Paris timezone falls 
> exactly in the middle of the DST shift. In this particular timezone this date 
> time never really exists since we are jumping directly from 02:00:00 to 
> 03:00:00. However, the TIMESTAMP data type in Hive is timezone agnostic 
> (https://cwiki.apache.org/confluence/display/Hive/Different+TIMESTAMP+types) 
> so it is a perfectly valid timestamp that can be inserted in a table and we 
> must be able to recover it back.
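The DST gap mentioned above can be reproduced outside Hive. The following Python sketch uses the standard `zoneinfo` module (Python 3.9+, and it assumes the system time zone database is available) to show that the wall-clock value 02:30 does not survive a round trip through UTC in Europe/Paris, while the same value as a naive, timezone-agnostic timestamp is perfectly valid. The helper name `exists_in_zone` is made up for illustration.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

paris = ZoneInfo("Europe/Paris")
# A naive datetime plays the role of Hive's timezone-agnostic TIMESTAMP.
naive = datetime(2023, 3, 26, 2, 30)

def exists_in_zone(wall, zone):
    """True if the wall-clock time survives a round trip through UTC in `zone`."""
    aware = wall.replace(tzinfo=zone)
    round_trip = aware.astimezone(timezone.utc).astimezone(zone)
    return round_trip.replace(tzinfo=None) == wall

# Interpreting the value in Europe/Paris and normalising it moves it out of the gap.
shifted = naive.replace(tzinfo=paris).astimezone(timezone.utc).astimezone(paris)

print(exists_in_zone(naive, paris))
print(shifted.replace(tzinfo=None))
```

Any code path that converts the partition value through a zone-aware type can therefore end up comparing against a different wall-clock time than the one stored, which is one plausible source of a mismatch between two lookup paths.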
> For the SELECT query above, partition pruning kicks in and calls the 
> ObjectStore#getPartitionsByExpr method in order to fetch the respective 
> partitions matching the timestamp from HMS.
> The tests however reveal that DirectSQL and JDO paths are not returning the 
> same results leading to an exception when VerifyingObjectStore is used. 
> According to the error below DirectSQL is able to recover one partition from 
> HMS (expected) while JDO/ORM returns empty (not expected).
> {noformat}
> 2023-10-06T03:51:19,406 ERROR [80252df4-3fdc-4971-badf-ad67ce8567c7 main] 
> metastore.VerifyingObjectStore: Lists are not the same size: SQL 1, ORM 0
> 2023-10-06T03:51:19,409 ERROR [80252df4-3fdc-4971-badf-ad67ce8567c7 main] 
> metastore.RetryingHMSHandler: MetaException(message:Lists are not the same 
> size: SQL 1, ORM 0)
>   at 
> org.apache.hadoop.hive.metastore.VerifyingObjectStore.verifyLists(VerifyingObjectStore.java:148)
>   at 
> org.apache.hadoop.hive.metastore.VerifyingObjectStore.getPartitionsByExpr(VerifyingObjectStore.java:88)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:97)
>   at com.sun.proxy.$Proxy57.getPartitionsByExpr(Unknown Source)
>   at 
> org.apache.hadoop.hive.metastore.HMSHandler.get_partitions_spec_by_expr(HMSHandler.java:7330)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:98)
>   at 
> org.apache.hadoop.hive.metastore.AbstractHMSHandlerProxy.invoke(AbstractHMSHandlerProxy.java:82)
>   at com.sun.proxy.$Proxy59.get_partitions_spec_by_expr(Unknown Source)
>   at 
>

[jira] [Comment Edited] (HIVE-27775) DirectSQL and JDO results are different when fetching partitions by timestamp in DST shift

2024-01-19 Thread Zhihua Deng (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-27775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17808855#comment-17808855
 ] 

Zhihua Deng edited comment on HIVE-27775 at 1/20/24 2:27 AM:
-

Not sure how many users could be using timestamp as the type for the partition 
column; it looks like this is a new feature introduced in Hive 4.0.

From the Jira description, the partition predicate in the query
{code:java}
SELECT * FROM payments WHERE txn_datetime = '2023-03-26 02:30:00'; {code}
returns an empty result on direct SQL, while JDO returns a partition 
('2023-03-26 02:30:00'). Given that the TIMESTAMP data type in Hive is timezone 
agnostic, I assume the result from JDO is correct.

Besides this difference, when I tested the date/timestamp partition column 
against Postgres and MySQL, there are some problems:

Postgres(direct sql):

operator does not exist: timestamp without time zone = character varying

MySQL(direct sql):

You have an error in your SQL syntax; check the manual that corresponds to your 
MySQL server version for the right syntax to use near 'TIMESTAMP) else null 
end) as TIMESTAMP) = '2023-03-26 03:30:00'))' at line 1 
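
A note on where the '03:30:00' literal in the MySQL error above may come from. The sketch below is plain java.time, not Hive code; the class and method names are made up for illustration. It shows that resolving the wall-clock value 2023-03-26 02:30:00 in Europe/Paris moves it forward by the length of the DST gap, while a zone without DST leaves it untouched. Whether the metastore filter goes through exactly this conversion is an assumption here.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZonedDateTime;

class DstGapDemo {
    /**
     * Interprets a zone-less wall-clock value in the given zone and returns
     * the local date-time that java.time actually settles on.
     */
    static LocalDateTime resolveInZone(LocalDateTime local, String zone) {
        // For a value inside a DST gap, atZone() shifts it later by the gap length.
        ZonedDateTime zoned = local.atZone(ZoneId.of(zone));
        return zoned.toLocalDateTime();
    }

    public static void main(String[] args) {
        LocalDateTime inGap = LocalDateTime.of(2023, 3, 26, 2, 30, 0);
        System.out.println(resolveInZone(inGap, "Europe/Paris")); // 2023-03-26T03:30
        System.out.println(resolveInZone(inGap, "UTC"));          // 2023-03-26T02:30
    }
}
```

A timezone-agnostic TIMESTAMP has to round-trip as 02:30:00, so any code path that passes through a zoned type in Europe/Paris would silently change the value being compared.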



> DirectSQL and JDO results are different when fetching partitions by timestamp 
> in DST shift
> --
>
> Key: HIVE-27775
> URL: https://issues.apache.org/jira/browse/HIVE-27775
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Affects Versions: 4.0.0-beta-1
>Reporter: Stamatis Zampetakis
>Assignee: Zhihua Deng
>Priority: Critical
>  Labels: pull-request-available
>
> DirectSQL and JDO results are different when fetching partitions by timestamp 
> in DST shift.
> {code:sql}
> --! qt:timezone:Europe/Paris
> CREATE EXTERNAL TABLE payments (card string) PARTITIONED BY(txn_datetime 
> TIMESTAMP) STORED AS ORC;
> INSERT into payments VALUES('---', '2023-03-26 02:30:00');
> SELECT * FROM payments WHERE txn_datetime = '2023-03-26 02:30:00';
> {code}
> The '2023-03-26 02:30:00' is a timestamp that in Europe/Paris timezone falls 
> exactly in the middle of the DST shift. In this particular timezone this date 
> time never really exists since we are jumping directly from 02:00:00 to 
> 03:00:00. However, the TIMESTAMP data type in Hive is timezone agnostic 
> (https://cwiki.apache.org/confluence/display/Hive/Different+TIMESTAMP+types) 
> so it is a perfectly valid timestamp that can be inserted in a table and we 
> must be able to recover it back.
> For the SELECT query above, partition pruning kicks in and calls the 
> ObjectStore#getPartitionsByExpr method in order to fetch the respective 
> partitions matching the timestamp from HMS.
> The tests however reveal that DirectSQL and JDO paths are not returning the 
> same results leading to an exception when VerifyingObjectStore is used. 
> According to the error below DirectSQL is able to recover one partition from 
> HMS (expected) while JDO/ORM returns empty (not expected).
> {noformat}
> 2023-10-06T03:51:19,406 ERROR [80252df4-3fdc-4971-badf-ad67ce8567c7 main] 
> metastore.VerifyingObjectStore: Lists are not the same size: SQL 1, ORM 0
> 2023-10-06T03:51:19,409 ERROR [80252df4-3fdc-4971-badf-ad67ce8567c7 main] 
> metastore.RetryingHMSHandler: MetaException(message:Lists are not the same 
> size: SQL 1, ORM 0)
>   at 
> org.apache.hadoop.hive.metastore.VerifyingObjectStore.verifyLists(VerifyingObjectStore.java:148)
>   at 
> org.apache.hadoop.hive.metastore.VerifyingObjectStore.getPartitionsByExpr(VerifyingObjectStore.java:88)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
>

[jira] [Resolved] (HIVE-27994) Optimize renaming the partitioned table

2024-01-19 Thread Zhihua Deng (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-27994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng resolved HIVE-27994.

Fix Version/s: 4.0.0
   Resolution: Fixed

Fix has been merged. Thank you [~zhangbutao] and [~hemanth619] for the review!

> Optimize renaming the partitioned table
> ---
>
> Key: HIVE-27994
> URL: https://issues.apache.org/jira/browse/HIVE-27994
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zhihua Deng
>Assignee: Zhihua Deng
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> In case of table rename, every row in PART_COL_STATS associated with the 
> table has to be fetched, stored in memory, deleted, and re-inserted with the 
> new db/table name. This could take hours if the table has thousands of 
> column statistics in PART_COL_STATS.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

[jira] [Updated] (HIVE-28012) Invalid reference to the newly added column

2024-01-18 Thread Zhihua Deng (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-28012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng updated HIVE-28012:
---
Description: 
Steps to repro:
{code:java}
--! qt:dataset:src
--! qt:dataset:part
set hive.stats.autogather=true;
set hive.stats.column.autogather=true;
set 
metastore.metadata.transformer.class=org.apache.hadoop.hive.metastore.MetastoreDefaultTransformer;
set hive.metastore.client.capabilities=HIVEFULLACIDWRITE,HIVEFULLACIDREAD;
set hive.create.as.external.legacy=true;
set hive.support.concurrency=true;
set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;

CREATE TABLE rename_partition_table0 (key STRING, value STRING) PARTITIONED BY 
(part STRING) STORED AS ORC;
INSERT OVERWRITE TABLE rename_partition_table0 PARTITION (part = '1') SELECT * 
FROM src where rand(1) < 0.5;
ALTER TABLE rename_partition_table0 ADD COLUMNS (new_col INT);
INSERT OVERWRITE TABLE rename_partition_table0 PARTITION (part = '2') SELECT 
src.*, 1 FROM src;
{code}
Setting hive.metastore.client.cache.v2.enabled=false can act as a workaround.

  was:
Steps to repro:

 
{code:java}
--! qt:dataset:src
--! qt:dataset:part
set hive.stats.autogather=true;
set hive.stats.column.autogather=true;
set 
metastore.metadata.transformer.class=org.apache.hadoop.hive.metastore.MetastoreDefaultTransformer;
set hive.metastore.client.capabilities=HIVEFULLACIDWRITE,HIVEFULLACIDREAD;
set hive.create.as.external.legacy=true;
set hive.support.concurrency=true;
set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;

CREATE TABLE rename_partition_table0 (key STRING, value STRING) PARTITIONED BY 
(part STRING) STORED AS ORC;
INSERT OVERWRITE TABLE rename_partition_table0 PARTITION (part = '1') SELECT * 
FROM src where rand(1) < 0.5;
ALTER TABLE rename_partition_table0 ADD COLUMNS (new_col INT);
INSERT OVERWRITE TABLE rename_partition_table0 PARTITION (part = '2') SELECT 
src.*, 1 FROM src;
{code}
 

 


> Invalid reference to the newly added column
> ---
>
> Key: HIVE-28012
> URL: https://issues.apache.org/jira/browse/HIVE-28012
> Project: Hive
>  Issue Type: Bug
>Reporter: Zhihua Deng
>Priority: Major
>
> Steps to repro:
> {code:java}
> --! qt:dataset:src
> --! qt:dataset:part
> set hive.stats.autogather=true;
> set hive.stats.column.autogather=true;
> set 
> metastore.metadata.transformer.class=org.apache.hadoop.hive.metastore.MetastoreDefaultTransformer;
> set hive.metastore.client.capabilities=HIVEFULLACIDWRITE,HIVEFULLACIDREAD;
> set hive.create.as.external.legacy=true;
> set hive.support.concurrency=true;
> set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
> CREATE TABLE rename_partition_table0 (key STRING, value STRING) PARTITIONED 
> BY (part STRING) STORED AS ORC;
> INSERT OVERWRITE TABLE rename_partition_table0 PARTITION (part = '1') SELECT 
> * FROM src where rand(1) < 0.5;
> ALTER TABLE rename_partition_table0 ADD COLUMNS (new_col INT);
> INSERT OVERWRITE TABLE rename_partition_table0 PARTITION (part = '2') SELECT 
> src.*, 1 FROM src;
> {code}
> Setting hive.metastore.client.cache.v2.enabled=false can act as a workaround.




[jira] [Created] (HIVE-28012) Invalid reference to the newly added column

2024-01-18 Thread Zhihua Deng (Jira)

Zhihua Deng created HIVE-28012:
--

 Summary: Invalid reference to the newly added column
 Key: HIVE-28012
 URL: https://issues.apache.org/jira/browse/HIVE-28012
 Project: Hive
  Issue Type: Bug
Reporter: Zhihua Deng


Steps to repro:

 
{code:java}
--! qt:dataset:src
--! qt:dataset:part
set hive.stats.autogather=true;
set hive.stats.column.autogather=true;
set 
metastore.metadata.transformer.class=org.apache.hadoop.hive.metastore.MetastoreDefaultTransformer;
set hive.metastore.client.capabilities=HIVEFULLACIDWRITE,HIVEFULLACIDREAD;
set hive.create.as.external.legacy=true;
set hive.support.concurrency=true;
set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;

CREATE TABLE rename_partition_table0 (key STRING, value STRING) PARTITIONED BY 
(part STRING) STORED AS ORC;
INSERT OVERWRITE TABLE rename_partition_table0 PARTITION (part = '1') SELECT * 
FROM src where rand(1) < 0.5;
ALTER TABLE rename_partition_table0 ADD COLUMNS (new_col INT);
INSERT OVERWRITE TABLE rename_partition_table0 PARTITION (part = '2') SELECT 
src.*, 1 FROM src;
{code}
 

 




[jira] [Assigned] (HIVE-28011) Update the table info in PART_COL_STATS directly in case of table rename

2024-01-18 Thread Zhihua Deng (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-28011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng reassigned HIVE-28011:
--

Assignee: Zhihua Deng

> Update the table info in PART_COL_STATS directly in case of table rename
> 
>
> Key: HIVE-28011
> URL: https://issues.apache.org/jira/browse/HIVE-28011
> Project: Hive
>  Issue Type: Improvement
>  Components: Standalone Metastore
>Reporter: Zhihua Deng
>Assignee: Zhihua Deng
>Priority: Major
>
> Following the discussion on 
> [https://github.com/apache/hive/pull/4995#issuecomment-1899477224], there is 
> still some room for performance tuning in case of table rename: we don't 
> need to fetch all the column statistics from the database and then update 
> them in batch.




[jira] [Created] (HIVE-28011) Update the table info in PART_COL_STATS directly in case of table rename

2024-01-18 Thread Zhihua Deng (Jira)

Zhihua Deng created HIVE-28011:
--

 Summary: Update the table info in PART_COL_STATS directly in case 
of table rename
 Key: HIVE-28011
 URL: https://issues.apache.org/jira/browse/HIVE-28011
 Project: Hive
  Issue Type: Improvement
  Components: Standalone Metastore
Reporter: Zhihua Deng


Following the discussion on 
[https://github.com/apache/hive/pull/4995#issuecomment-1899477224], there is 
still some room for performance tuning in case of table rename: we don't need 
to fetch all the column statistics from the database and then update them in 
batch.
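
The difference between the two strategies can be sketched with a toy in-memory model. Everything below is hypothetical: the StatRow class, its fields, and the method names are invented for illustration and are not the metastore's code. The loop in updateInPlace merely stands in for a single set-based UPDATE that the database would run on its own.

```java
import java.util.ArrayList;
import java.util.List;

class RenameStatsSketch {
    /** Hypothetical, simplified stand-in for one PART_COL_STATS row. */
    static final class StatRow {
        String tableName;
        final String partitionName;
        final String columnName;

        StatRow(String tableName, String partitionName, String columnName) {
            this.tableName = tableName;
            this.partitionName = partitionName;
            this.columnName = columnName;
        }
    }

    /** Fetch, delete, re-insert. Returns how many rows were held in client memory. */
    static int fetchDeleteReinsert(List<StatRow> store, String oldName, String newName) {
        List<StatRow> fetched = new ArrayList<>();
        for (StatRow row : store) {
            if (row.tableName.equals(oldName)) {
                fetched.add(row); // every matching row is materialized on the client
            }
        }
        store.removeAll(fetched);
        for (StatRow row : fetched) {
            store.add(new StatRow(newName, row.partitionName, row.columnName));
        }
        return fetched.size();
    }

    /** Direct update. Returns how many rows were held in client memory (none). */
    static int updateInPlace(List<StatRow> store, String oldName, String newName) {
        for (StatRow row : store) { // stands in for one UPDATE executed inside the database
            if (row.tableName.equals(oldName)) {
                row.tableName = newName;
            }
        }
        return 0;
    }

    /** Counts the rows that currently belong to the given table name. */
    static long countFor(List<StatRow> store, String tableName) {
        return store.stream().filter(r -> r.tableName.equals(tableName)).count();
    }
}
```

Both methods leave the store in the same logical state; only the first one scales its client-side memory with the number of statistics rows.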




[jira] [Resolved] (HIVE-28001) Fix the flaky test TestLeaderElection

2024-01-17 Thread Zhihua Deng (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-28001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng resolved HIVE-28001.

Fix Version/s: 4.0.0
   Resolution: Fixed

Fix has been merged. Thank you for the review [~hemanth619]!

> Fix the flaky test TestLeaderElection
> -
>
> Key: HIVE-28001
> URL: https://issues.apache.org/jira/browse/HIVE-28001
> Project: Hive
>  Issue Type: Test
>Reporter: Zhihua Deng
>Assignee: Zhihua Deng
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> TestLeaderElection fails intermittently, for example:
> [http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/master/2032/tests]
> [http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/PR-4995/3/tests]
> [http://ci.hive.apache.org/job/hive-flaky-check/796/]
> The failures are caused by:
> java.lang.AssertionError
>      at 
> org.apache.hadoop.hive.metastore.leader.TestLeaderElection.testLeaseLeaderElection(TestLeaderElection.java:121)
> {code:java}
> assertFalse(lockId1 == instance1.getLockId()); {code}
> The lock ID remains the same, but it is supposed to change after a new leader 
> is elected.
>  




[jira] [Updated] (HIVE-28001) Fix the flaky test TestLeaderElection

2024-01-17 Thread Zhihua Deng (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-28001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng updated HIVE-28001:
---
Description: 
TestLeaderElection fails intermittently, for example:

[http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/master/2032/tests]

[http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/PR-4995/3/tests]

[http://ci.hive.apache.org/job/hive-flaky-check/796/]

The failures are caused by:
java.lang.AssertionError
     at 
org.apache.hadoop.hive.metastore.leader.TestLeaderElection.testLeaseLeaderElection(TestLeaderElection.java:121)
{code:java}
assertFalse(lockId1 == instance1.getLockId()); {code}
The lock ID remains the same, but it is supposed to change after a new leader 
is elected.
 

  was:
The TestLeaderElection is failing sometimes, example:

[http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/master/2032/tests]

[http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/PR-4995/3/tests]

[http://ci.hive.apache.org/job/hive-flaky-check/796/]


> Fix the flaky test TestLeaderElection
> -
>
> Key: HIVE-28001
> URL: https://issues.apache.org/jira/browse/HIVE-28001
> Project: Hive
>  Issue Type: Test
>Reporter: Zhihua Deng
>Assignee: Zhihua Deng
>Priority: Major
>  Labels: pull-request-available
>
> TestLeaderElection fails intermittently, for example:
> [http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/master/2032/tests]
> [http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/PR-4995/3/tests]
> [http://ci.hive.apache.org/job/hive-flaky-check/796/]
> The failures are caused by:
> java.lang.AssertionError
>      at 
> org.apache.hadoop.hive.metastore.leader.TestLeaderElection.testLeaseLeaderElection(TestLeaderElection.java:121)
> {code:java}
> assertFalse(lockId1 == instance1.getLockId()); {code}
> The lock ID remains the same, but it is supposed to change after a new leader 
> is elected.
>  




[jira] [Updated] (HIVE-28001) Fix the flaky test TestLeaderElection

2024-01-16 Thread Zhihua Deng (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-28001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng updated HIVE-28001:
---
Description: 
TestLeaderElection fails intermittently, for example:

[http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/master/2032/tests]

[http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/PR-4995/3/tests]

[http://ci.hive.apache.org/job/hive-flaky-check/796/]

  was:
The TestLeaderElection is failing sometimes, example:

[http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/master/2032/tests]

[http://ci.hive.apache.org/job/hive-flaky-check/796/]


> Fix the flaky test TestLeaderElection
> -
>
> Key: HIVE-28001
> URL: https://issues.apache.org/jira/browse/HIVE-28001
> Project: Hive
>  Issue Type: Test
>Reporter: Zhihua Deng
>Assignee: Zhihua Deng
>Priority: Major
>
> TestLeaderElection fails intermittently, for example:
> [http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/master/2032/tests]
> [http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/PR-4995/3/tests]
> [http://ci.hive.apache.org/job/hive-flaky-check/796/]




[jira] [Created] (HIVE-28001) Fix the flaky test TestLeaderElection

2024-01-15 Thread Zhihua Deng (Jira)

Zhihua Deng created HIVE-28001:
--

 Summary: Fix the flaky test TestLeaderElection
 Key: HIVE-28001
 URL: https://issues.apache.org/jira/browse/HIVE-28001
 Project: Hive
  Issue Type: Test
Reporter: Zhihua Deng
Assignee: Zhihua Deng


TestLeaderElection fails intermittently, for example:

[http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/master/2032/tests]

[http://ci.hive.apache.org/job/hive-flaky-check/796/]




[jira] [Resolved] (HIVE-27955) Missing Postgres driver when start services from Docker compose

2024-01-15 Thread Zhihua Deng (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-27955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng resolved HIVE-27955.

Fix Version/s: 4.0.0
   Resolution: Fixed

> Missing Postgres driver when start services from Docker compose
> ---
>
> Key: HIVE-27955
> URL: https://issues.apache.org/jira/browse/HIVE-27955
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Zhihua Deng
>Assignee: Zhihua Deng
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> {noformat}
> 2023-12-13T15:24:17.52148Z SLF4J: See 
> http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
> 2023-12-13T15:24:17.590761756Z SLF4J: Actual binding is of type 
> [org.apache.logging.slf4j.Log4jLoggerFactory]
> 2023-12-13T15:24:18.033083131Z Metastore connection URL:   
> jdbc:postgresql://postgres:5432/metastore_db
> 2023-12-13T15:24:18.033102839Z Metastore connection Driver :   
> org.postgresql.Driver
> 2023-12-13T15:24:18.033105006Z Metastore connection User:  hive
> 2023-12-13T15:24:18.037759839Z Initializing the schema to: 4.0.0-beta-2
> 2023-12-13T15:24:18.037804339Z Metastore connection URL:   
> jdbc:postgresql://postgres:5432/metastore_db
> 2023-12-13T15:24:18.037826339Z Metastore connection Driver :   
> org.postgresql.Driver
> 2023-12-13T15:24:18.037839673Z Metastore connection User:  hive
> 2023-12-13T15:24:18.038597089Z Failed to load driver
> 2023-12-13T15:24:18.038604881Z Underlying cause: 
> java.lang.ClassNotFoundException : org.postgresql.Driver
> 2023-12-13T15:24:18.038673423Z Use --verbose for detailed stacktrace.
> 2023-12-13T15:24:18.038678756Z *** schemaTool failed ***
> 2023-12-13T15:24:18.095206673Z + '[' 1 -eq 0 ']'
> 2023-12-13T15:24:18.095225048Z + echo 'Schema initialization 
> failed!'{noformat}




[jira] [Updated] (HIVE-27994) Optimize renaming the partitioned table

2024-01-10 Thread Zhihua Deng (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-27994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng updated HIVE-27994:
---
Description: 
In case of table rename, every row in PART_COL_STATS associated with the table 
has to be fetched, stored in memory, deleted, and re-inserted with the new 
db/table name. This could take hours if the table has thousands of column 
statistics in PART_COL_STATS.

 

  was:
In case of table rename, every row in PART_COL_STATS associated with the table 
should be fetched, stored in memory, delete & re-insert with new db/table name, 
this could take hours if the table have thousands of column statistics in 
PART_COL_STATS.

 


> Optimize renaming the partitioned table
> ---
>
> Key: HIVE-27994
> URL: https://issues.apache.org/jira/browse/HIVE-27994
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zhihua Deng
>Assignee: Zhihua Deng
>Priority: Major
>
> In case of table rename, every row in PART_COL_STATS associated with the 
> table has to be fetched, stored in memory, deleted, and re-inserted with the 
> new db/table name. This could take hours if the table has thousands of 
> column statistics in PART_COL_STATS.
>  




[jira] [Created] (HIVE-27994) Optimize renaming the partitioned table

2024-01-10 Thread Zhihua Deng (Jira)

Zhihua Deng created HIVE-27994:
--

 Summary: Optimize renaming the partitioned table
 Key: HIVE-27994
 URL: https://issues.apache.org/jira/browse/HIVE-27994
 Project: Hive
  Issue Type: Improvement
Reporter: Zhihua Deng
Assignee: Zhihua Deng


In case of table rename, every row in PART_COL_STATS associated with the table 
has to be fetched, stored in memory, deleted, and re-inserted with the new 
db/table name. This could take hours if the table has thousands of column 
statistics in PART_COL_STATS.

 




[jira] [Commented] (HIVE-27969) Add verbose logging for schematool and metastore service for Docker container

2024-01-04 Thread Zhihua Deng (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-27969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17803160#comment-17803160
 ] 

Zhihua Deng commented on HIVE-27969:


Fix has been merged into master. Thank you for the PR [~akshatm]!

> Add verbose logging for schematool and metastore service for Docker container
> -
>
> Key: HIVE-27969
> URL: https://issues.apache.org/jira/browse/HIVE-27969
> Project: Hive
>  Issue Type: Improvement
>Reporter: Akshat Mathur
>Assignee: Akshat Mathur
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> Adding capability to print verbose logs for schematool and metastore service 
> inside docker container.
>  
> Note: hiveserver2 doesn't support the verbose option.




[jira] [Resolved] (HIVE-27969) Add verbose logging for schematool and metastore service for Docker container

2024-01-04 Thread Zhihua Deng (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-27969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng resolved HIVE-27969.

Fix Version/s: 4.0.0
   Resolution: Fixed

> Add verbose logging for schematool and metastore service for Docker container
> -
>
> Key: HIVE-27969
> URL: https://issues.apache.org/jira/browse/HIVE-27969
> Project: Hive
>  Issue Type: Improvement
>Reporter: Akshat Mathur
>Assignee: Akshat Mathur
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> Adding capability to print verbose logs for schematool and metastore service 
> inside docker container.
>  
> Note: hiveserver2 doesn't support the verbose option.




[jira] [Assigned] (HIVE-27955) Missing Postgres driver when start services from Docker compose

2023-12-13 Thread Zhihua Deng (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-27955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng reassigned HIVE-27955:
--

Assignee: Zhihua Deng

> Missing Postgres driver when start services from Docker compose
> ---
>
> Key: HIVE-27955
> URL: https://issues.apache.org/jira/browse/HIVE-27955
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Zhihua Deng
>Assignee: Zhihua Deng
>Priority: Major
>
> {noformat}
> 2023-12-13T15:24:17.52148Z SLF4J: See 
> http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
> 2023-12-13T15:24:17.590761756Z SLF4J: Actual binding is of type 
> [org.apache.logging.slf4j.Log4jLoggerFactory]
> 2023-12-13T15:24:18.033083131Z Metastore connection URL:   
> jdbc:postgresql://postgres:5432/metastore_db
> 2023-12-13T15:24:18.033102839Z Metastore connection Driver :   
> org.postgresql.Driver
> 2023-12-13T15:24:18.033105006Z Metastore connection User:  hive
> 2023-12-13T15:24:18.037759839Z Initializing the schema to: 4.0.0-beta-2
> 2023-12-13T15:24:18.037804339Z Metastore connection URL:   
> jdbc:postgresql://postgres:5432/metastore_db
> 2023-12-13T15:24:18.037826339Z Metastore connection Driver :   
> org.postgresql.Driver
> 2023-12-13T15:24:18.037839673Z Metastore connection User:  hive
> 2023-12-13T15:24:18.038597089Z Failed to load driver
> 2023-12-13T15:24:18.038604881Z Underlying cause: 
> java.lang.ClassNotFoundException : org.postgresql.Driver
> 2023-12-13T15:24:18.038673423Z Use --verbose for detailed stacktrace.
> 2023-12-13T15:24:18.038678756Z *** schemaTool failed ***
> 2023-12-13T15:24:18.095206673Z + '[' 1 -eq 0 ']'
> 2023-12-13T15:24:18.095225048Z + echo 'Schema initialization 
> failed!'{noformat}




[jira] [Updated] (HIVE-27955) Missing Postgres driver when start services from Docker compose

2023-12-13 Thread Zhihua Deng (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-27955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng updated HIVE-27955:
---
Parent: HIVE-26965
Issue Type: Sub-task  (was: Bug)

> Missing Postgres driver when start services from Docker compose
> ---
>
> Key: HIVE-27955
> URL: https://issues.apache.org/jira/browse/HIVE-27955
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Zhihua Deng
>Priority: Major
>
> {noformat}
> 2023-12-13T15:24:17.52148Z SLF4J: See 
> http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
> 2023-12-13T15:24:17.590761756Z SLF4J: Actual binding is of type 
> [org.apache.logging.slf4j.Log4jLoggerFactory]
> 2023-12-13T15:24:18.033083131Z Metastore connection URL:   
> jdbc:postgresql://postgres:5432/metastore_db
> 2023-12-13T15:24:18.033102839Z Metastore connection Driver :   
> org.postgresql.Driver
> 2023-12-13T15:24:18.033105006Z Metastore connection User:  hive
> 2023-12-13T15:24:18.037759839Z Initializing the schema to: 4.0.0-beta-2
> 2023-12-13T15:24:18.037804339Z Metastore connection URL:   
> jdbc:postgresql://postgres:5432/metastore_db
> 2023-12-13T15:24:18.037826339Z Metastore connection Driver :   
> org.postgresql.Driver
> 2023-12-13T15:24:18.037839673Z Metastore connection User:  hive
> 2023-12-13T15:24:18.038597089Z Failed to load driver
> 2023-12-13T15:24:18.038604881Z Underlying cause: 
> java.lang.ClassNotFoundException : org.postgresql.Driver
> 2023-12-13T15:24:18.038673423Z Use --verbose for detailed stacktrace.
> 2023-12-13T15:24:18.038678756Z *** schemaTool failed ***
> 2023-12-13T15:24:18.095206673Z + '[' 1 -eq 0 ']'
> 2023-12-13T15:24:18.095225048Z + echo 'Schema initialization 
> failed!'{noformat}




[jira] [Created] (HIVE-27955) Missing Postgres driver when start services from Docker compose

2023-12-13 Thread Zhihua Deng (Jira)

Zhihua Deng created HIVE-27955:
--

 Summary: Missing Postgres driver when start services from Docker 
compose
 Key: HIVE-27955
 URL: https://issues.apache.org/jira/browse/HIVE-27955
 Project: Hive
  Issue Type: Bug
Reporter: Zhihua Deng


{noformat}
2023-12-13T15:24:17.52148Z SLF4J: See 
http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
2023-12-13T15:24:17.590761756Z SLF4J: Actual binding is of type 
[org.apache.logging.slf4j.Log4jLoggerFactory]
2023-12-13T15:24:18.033083131Z Metastore connection URL: 
jdbc:postgresql://postgres:5432/metastore_db
2023-12-13T15:24:18.033102839Z Metastore connection Driver : 
org.postgresql.Driver
2023-12-13T15:24:18.033105006Z Metastore connection User:hive
2023-12-13T15:24:18.037759839Z Initializing the schema to: 4.0.0-beta-2
2023-12-13T15:24:18.037804339Z Metastore connection URL: 
jdbc:postgresql://postgres:5432/metastore_db
2023-12-13T15:24:18.037826339Z Metastore connection Driver : 
org.postgresql.Driver
2023-12-13T15:24:18.037839673Z Metastore connection User:hive
2023-12-13T15:24:18.038597089Z Failed to load driver
2023-12-13T15:24:18.038604881Z Underlying cause: 
java.lang.ClassNotFoundException : org.postgresql.Driver
2023-12-13T15:24:18.038673423Z Use --verbose for detailed stacktrace.
2023-12-13T15:24:18.038678756Z *** schemaTool failed ***
2023-12-13T15:24:18.095206673Z + '[' 1 -eq 0 ']'
2023-12-13T15:24:18.095225048Z + echo 'Schema initialization failed!'{noformat}
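
The "Failed to load driver" line above is a plain classpath problem, and it can be checked independently of schemaTool. The sketch below is a minimal probe; the class name DriverCheck is made up for illustration, and it only reports whether a class is loadable from the classpath it is started with (it does not know how the Docker image assembles that classpath).

```java
class DriverCheck {
    /** Returns true if the named class can be loaded from the current classpath. */
    static boolean isOnClasspath(String className) {
        try {
            // initialize=false: only locate and load the class, do not run static initializers
            Class.forName(className, false, DriverCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String driver = args.length > 0 ? args[0] : "org.postgresql.Driver";
        System.out.println(driver + (isOnClasspath(driver) ? " is" : " is NOT") + " on the classpath");
    }
}
```

Running it with the same classpath as the failing container should report "is NOT" for org.postgresql.Driver until the Postgres JDBC jar is added.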




[jira] [Updated] (HIVE-27555) Upgrade issues with Kudu table on backend db

2023-12-09 Thread Zhihua Deng (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-27555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng updated HIVE-27555:
---
Fix Version/s: 4.0.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

Fix has been merged to master. Thank you [~aturoczy], [~dkuzmenko] for the 
review!

> Upgrade issues with Kudu table on backend db
> 
>
> Key: HIVE-27555
> URL: https://issues.apache.org/jira/browse/HIVE-27555
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0-beta-1
>Reporter: Zhihua Deng
>Assignee: Zhihua Deng
>Priority: Critical
>  Labels: hive-4.0.0-must, pull-request-available
> Fix For: 4.0.0
>
>
> In HIVE-27457, we try to update the serde lib and (input/output) format of 
> the Kudu table in the backend db. In the upgrade scripts, we join 
> "SDS"."SD_ID" with "TABLE_PARAMS"."TBL_ID":
> https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/sql/mysql/upgrade-4.0.0-alpha-2-to-4.0.0-beta-1.mysql.sql#L37-L39
> As "SD_ID" is the primary key of SDS and "TBL_ID" is the primary key of 
> TBLS, we can't join the two tables using these two columns.
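
Why the join goes wrong can be shown with a toy in-memory model. The maps below stand in for TBLS (TBL_ID to SD_ID) and SDS (SD_ID to input format); the ids, format strings, class name, and method names are hypothetical, and the sketch assumes the usual metastore layout in which TBLS carries an SD_ID column pointing at SDS.

```java
import java.util.HashMap;
import java.util.Map;

class KuduJoinSketch {
    /** Correct lookup: go from the table id to its storage descriptor through TBLS. */
    static String inputFormatViaTbls(Map<Long, Long> tbls, Map<Long, String> sds, long tblId) {
        Long sdId = tbls.get(tblId);
        return sdId == null ? null : sds.get(sdId);
    }

    /** Faulty lookup: treats TBL_ID as if it were SD_ID, as the join described above does. */
    static String inputFormatViaDirectJoin(Map<Long, String> sds, long tblId) {
        return sds.get(tblId);
    }

    public static void main(String[] args) {
        Map<Long, Long> tbls = new HashMap<>();   // TBL_ID -> SD_ID
        tbls.put(1L, 10L);                        // table 1 is the Kudu table
        tbls.put(2L, 1L);                         // table 2 happens to own SD_ID 1
        Map<Long, String> sds = new HashMap<>();  // SD_ID -> input format (made-up values)
        sds.put(10L, "kudu-input-format");
        sds.put(1L, "orc-input-format");

        System.out.println(inputFormatViaTbls(tbls, sds, 1L));   // kudu-input-format
        System.out.println(inputFormatViaDirectJoin(sds, 1L));   // orc-input-format
    }
}
```

The direct join silently lands on another table's storage descriptor whenever the numeric ids happen to collide, so the upgrade script could rewrite the wrong SDS rows.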




[jira] [Comment Edited] (HIVE-27775) DirectSQL and JDO results are different when fetching partitions by timestamp in DST shift

2023-12-07 Thread Zhihua Deng (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-27775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17794502#comment-17794502
 ] 

Zhihua Deng edited comment on HIVE-27775 at 12/8/23 3:34 AM:
-

On the JDO path, we use the partition name to fetch the matching partitions: 
{code:java}
SELECT DISTINCT 'org.apache.hadoop.hive.metastore.model.MPartition' AS 
DN_TYPE,A0.CREATE_TIME,A0.LAST_ACCESS_TIME,A0.PART_NAME AS 
NUCORDER0,A0.WRITE_ID,A0.PART_ID FROM PARTITIONS A0 LEFT OUTER JOIN TBLS B0 ON 
A0.TBL_ID = B0.TBL_ID LEFT OUTER JOIN DBS C0 ON B0.DB_ID = C0.DB_ID WHERE 
B0.TBL_NAME = <'payments'> AND C0."NAME" = <'default'> AND C0.CTLG_NAME = 
<'hive'> AND A0.PART_NAME = <'txn_datetime=2023-03-26 03%3A30%3A00'> ORDER BY 
NUCORDER0 {code}
In the above example, the filter is A0.PART_NAME = <'txn_datetime=2023-03-26 
03%3A30%3A00'>. A0.PART_NAME is timezone agnostic, so this could lead to a 
wrong result if <'txn_datetime=2023-03-26 03%3A30%3A00'> is a timezone-based 
partition name.

Compared to direct mode, 
{code:java}
select "PARTITIONS"."PART_ID" from "PARTITIONS"  inner join "TBLS" on 
"PARTITIONS"."TBL_ID" = "TBLS"."TBL_ID"     and "TBLS"."TBL_NAME" = 
<'payments'>   inner join "DBS" on "TBLS"."DB_ID" = "DBS"."DB_ID"      and 
"DBS"."NAME" = <'default'> inner join "PARTITION_KEY_VALS" "FILTER0" on 
"FILTER0"."PART_ID" = "PARTITIONS"."PART_ID" and "FILTER0"."INTEGER_IDX" = 0 
where "DBS"."CTLG_NAME" = <'hive'>  and ((cast((case when 
"FILTER0"."PART_KEY_VAL" <> <'__HIVE_DEFAULT_PARTITION__'> and 
"TBLS"."TBL_NAME" = <'payments'> and "DBS"."NAME" = <'default'> and 
"DBS"."CTLG_NAME" = <'hive'> and "FILTER0"."PART_ID" = "PARTITIONS"."PART_ID" 
and "FILTER0"."INTEGER_IDX" = 0 then cast("FILTER0"."PART_KEY_VAL" as 
TIMESTAMP) else null end) as TIMESTAMP) = <'2023-03-26 03:30:00'>)) {code}
the filter is cast("FILTER0"."PART_KEY_VAL" as TIMESTAMP) else null end) as 
TIMESTAMP) = <'2023-03-26 03:30:00'>, so if we push down a timezone-based 
<'2023-03-26 03:30:00'> and the underlying database handles 
cast("FILTER0"."PART_KEY_VAL" as TIMESTAMP) else null end) as TIMESTAMP) 
properly, we get the expected result.

However, when I switch the backing db to Postgres, the direct SQL path throws 
an exception:
{noformat}
Caused by: org.postgresql.util.PSQLException: ERROR: operator does not exist: 
timestamp without time zone = character varying
  Hint: No operator matches the given name and argument types. You might need 
to add explicit type casts.
  Position: 662{noformat}
MySQL fails as well:
{code:java}
Caused by: java.sql.SQLSyntaxErrorException: You have an error in your SQL 
syntax; check the manual that corresponds to your MySQL server version for the 
right syntax to use near 'TIMESTAMP) else null end) as TIMESTAMP) = '2023-03-26 
03:30:00'))' at line 1 {code}
So I think we should fix the errors on the direct SQL path and make the 
timestamp/date handling timezone agnostic on the JDO path.
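
The issue's query asks for '2023-03-26 02:30:00', while the partition name in 
the JDO filter above carries 03:30:00. The sketch below shows one way such a 
shift can arise; it is an assumption about the mechanism, not a trace of the 
actual Hive code path. It takes the wall-clock value, interprets it in 
Europe/Paris, and converts it through UTC and back using Python's zoneinfo 
module (which needs the system tz database to be available).

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

paris = ZoneInfo("Europe/Paris")

# Wall-clock value from the issue; it falls in the 02:00 -> 03:00 DST gap.
wall = datetime(2023, 3, 26, 2, 30)

# Attach the zone, go through UTC, and come back to local time.
round_trip = wall.replace(tzinfo=paris).astimezone(timezone.utc).astimezone(paris)

# Render both as partition-value strings, dropping the zone again.
requested = wall.strftime("%Y-%m-%d %H:%M:%S")
shifted = round_trip.strftime("%Y-%m-%d %H:%M:%S")

print(requested)
print(shifted)
```

A timezone-agnostic comparison would keep the requested string unchanged, 
whereas any step that routes the value through a zone-aware type can produce 
the shifted string, and a name-based lookup then no longer matches the stored 
partition.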



[jira] [Commented] (HIVE-27775) DirectSQL and JDO results are different when fetching partitions by timestamp in DST shift

2023-12-07 Thread Zhihua Deng (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-27775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17794502#comment-17794502
 ] 

Zhihua Deng commented on HIVE-27775:


On the JDO path, we use the partition name to fetch the matching partitions: 
{code:java}
SELECT DISTINCT 'org.apache.hadoop.hive.metastore.model.MPartition' AS 
DN_TYPE,A0.CREATE_TIME,A0.LAST_ACCESS_TIME,A0.PART_NAME AS 
NUCORDER0,A0.WRITE_ID,A0.PART_ID FROM PARTITIONS A0 LEFT OUTER JOIN TBLS B0 ON 
A0.TBL_ID = B0.TBL_ID LEFT OUTER JOIN DBS C0 ON B0.DB_ID = C0.DB_ID WHERE 
B0.TBL_NAME = <'payments'> AND C0."NAME" = <'default'> AND C0.CTLG_NAME = 
<'hive'> AND A0.PART_NAME = <'txn_datetime=2023-03-26 03%3A30%3A00'> ORDER BY 
NUCORDER0 {code}
In the above example, the filter is A0.PART_NAME = <'txn_datetime=2023-03-26 
03%3A30%3A00'>. A0.PART_NAME is timezone agnostic, so this could lead to a 
wrong result if <'txn_datetime=2023-03-26 03%3A30%3A00'> is a timezone-based 
timestamp.

Compared to direct mode, 
{code:java}
select "PARTITIONS"."PART_ID" from "PARTITIONS"  inner join "TBLS" on 
"PARTITIONS"."TBL_ID" = "TBLS"."TBL_ID"     and "TBLS"."TBL_NAME" = 
<'payments'>   inner join "DBS" on "TBLS"."DB_ID" = "DBS"."DB_ID"      and 
"DBS"."NAME" = <'default'> inner join "PARTITION_KEY_VALS" "FILTER0" on 
"FILTER0"."PART_ID" = "PARTITIONS"."PART_ID" and "FILTER0"."INTEGER_IDX" = 0 
where "DBS"."CTLG_NAME" = <'hive'>  and ((cast((case when 
"FILTER0"."PART_KEY_VAL" <> <'__HIVE_DEFAULT_PARTITION__'> and 
"TBLS"."TBL_NAME" = <'payments'> and "DBS"."NAME" = <'default'> and 
"DBS"."CTLG_NAME" = <'hive'> and "FILTER0"."PART_ID" = "PARTITIONS"."PART_ID" 
and "FILTER0"."INTEGER_IDX" = 0 then cast("FILTER0"."PART_KEY_VAL" as 
TIMESTAMP) else null end) as TIMESTAMP) = <'2023-03-26 03:30:00'>)) {code}
the filter is cast("FILTER0"."PART_KEY_VAL" as TIMESTAMP) else null end) as 
TIMESTAMP) = <'2023-03-26 03:30:00'>, so if we push down a timezone-based 
<'2023-03-26 03:30:00'> and the underlying database handles 
cast("FILTER0"."PART_KEY_VAL" as TIMESTAMP) else null end) as TIMESTAMP) 
properly, we get the expected result.

However, when I switch the backing db to Postgres, the direct SQL path throws 
an exception:
{noformat}
Caused by: org.postgresql.util.PSQLException: ERROR: operator does not exist: 
timestamp without time zone = character varying
  Hint: No operator matches the given name and argument types. You might need 
to add explicit type casts.
  Position: 662{noformat}
MySQL fails as well:
{code:java}
Caused by: java.sql.SQLSyntaxErrorException: You have an error in your SQL 
syntax; check the manual that corresponds to your MySQL server version for the 
right syntax to use near 'TIMESTAMP) else null end) as TIMESTAMP) = '2023-03-26 
03:30:00'))' at line 1 {code}
So I think we should fix the errors on the direct SQL path and make the 
timestamp/date handling timezone agnostic on the JDO path.

> DirectSQL and JDO results are different when fetching partitions by timestamp 
> in DST shift
> --
>
> Key: HIVE-27775
> URL: https://issues.apache.org/jira/browse/HIVE-27775
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Affects Versions: 4.0.0-beta-1
>Reporter: Stamatis Zampetakis
>Assignee: Zhihua Deng
>Priority: Critical
>
> DirectSQL and JDO results are different when fetching partitions by timestamp 
> in DST shift.
> {code:sql}
> --! qt:timezone:Europe/Paris
> CREATE EXTERNAL TABLE payments (card string) PARTITIONED BY(txn_datetime 
> TIMESTAMP) STORED AS ORC;
> INSERT into payments VALUES('---', '2023-03-26 02:30:00');
> SELECT * FROM payments WHERE txn_datetime = '2023-03-26 02:30:00';
> {code}
> The '2023-03-26 02:30:00' is a timestamp that in Europe/Paris timezone falls 
> exactly in the middle of the DST shift. In this particular timezone this date 
> time never really exists since we are jumping directly from 02:00:00 to 
> 03:00:00. However, the TIMESTAMP data type in Hive is timezone agnostic 
> (https://cwiki.apache.org/confluence/display/Hive/Different+TIMESTAMP+types) 
> so it is a perfectly valid timestamp that can be inserted in a table and we 
> must be able to recover it back.
> For the SELECT query above, partition pruning kicks in and calls the 
> ObjectStore#getPartitionsByExpr method in order to fetch the respective 
> partitions matching the timestamp from HMS.
> The tests however reveal that DirectSQL and JDO paths are not returning the 
> same results leading to an exception when VerifyingObjectStore is used. 
> According to the error below DirectSQL is able to recover one partition from 
> HMS (expected) while JDO/ORM returns empty (not expected).
> {noformat}
> 2023-10-06T03:51:19,406 ERROR [80252df4-3fdc-4971-badf-ad67ce8567c7 main] 
> metastore.VerifyingObjectStore:

[jira] [Assigned] (HIVE-19818) SessionState getQueryId returns an empty string

2023-11-27 Thread Zhihua Deng (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-19818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng reassigned HIVE-19818:
--

Assignee: (was: Zhihua Deng)

> SessionState getQueryId returns an empty string
> ---
>
> Key: HIVE-19818
> URL: https://issues.apache.org/jira/browse/HIVE-19818
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.2
>Reporter: Zhihua Deng
>Priority: Minor
> Attachments: HIVE-19818.patch
>
>
> When we execute SQL asynchronously, a new configuration based on the one the 
> session holds is created and passed to the driver instance, which results in 
> an empty string being returned when SessionState#getQueryId is called later 
> on. This problem can be seen in HadoopJobExecHelper.java.
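
The effect described above can be modelled in a few lines. This is a toy 
sketch, not Hive code: the SessionState and Driver classes, the run_sync and 
run_async helpers, and the "query.id" key are all hypothetical names chosen 
for illustration. It only shows that a value written to a copied configuration 
is invisible to a reader of the original.

```python
import copy


class SessionState:
    """Toy stand-in for a session that owns a configuration dict."""

    def __init__(self):
        self.conf = {"query.id": ""}

    def get_query_id(self):
        # Reads only the session's own configuration.
        return self.conf["query.id"]


class Driver:
    """Toy stand-in for a driver that records the query id in its conf."""

    def __init__(self, conf):
        self.conf = conf

    def compile(self, query_id):
        self.conf["query.id"] = query_id


def run_sync(session, query_id):
    # The driver shares the session's configuration object.
    Driver(session.conf).compile(query_id)
    return session.get_query_id()


def run_async(session, query_id):
    # The driver gets a copy, so its writes never reach the session.
    Driver(copy.deepcopy(session.conf)).compile(query_id)
    return session.get_query_id()


sync_id = run_sync(SessionState(), "q1")
async_id = run_async(SessionState(), "q2")
print(repr(sync_id), repr(async_id))
```

In this model the asynchronous path returns an empty string because the query 
id lives only in the driver's private copy.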




[jira] [Assigned] (HIVE-24511) Potential classloader leak in SerDeStorageSchemaReader and add JsonSerde to managed serde

2023-11-27 Thread Zhihua Deng (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-24511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng reassigned HIVE-24511:
--

Assignee: (was: Zhihua Deng)

> Potential classloader leak in SerDeStorageSchemaReader and add JsonSerde to 
> managed serde
> -
>
> Key: HIVE-24511
> URL: https://issues.apache.org/jira/browse/HIVE-24511
> Project: Hive
>  Issue Type: Improvement
>  Components: Standalone Metastore
>Reporter: Zhihua Deng
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> 1. Close the created classloader to release resources.
> 2. Provide more detailed error messages when throwing MetaException.
> 3. Skip JsonSerDe/RegexSerDe creation when getting the columns/schemas of 
> such tables.




[jira] [Assigned] (HIVE-24422) Throw SemanticException when CTE alias is conflicted with table name

2023-11-27 Thread Zhihua Deng (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-24422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng reassigned HIVE-24422:
--

Assignee: (was: Zhihua Deng)

> Throw SemanticException when CTE alias is conflicted with table name
> 
>
> Key: HIVE-24422
> URL: https://issues.apache.org/jira/browse/HIVE-24422
> Project: Hive
>  Issue Type: Improvement
>  Components: Parser
>Reporter: Zhihua Deng
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> If the alias of a CTE conflicts with a table name, we use the alias to fetch 
> the table rather than replacing it with the ASTNode tree, which may cause 
> some confusing problems. For example:
> {noformat}
> create table game_info (game_name string);
> with game_info as (
> select distinct ext_id, dev_app_id, game_name
> from game_info_extend )
> select count(game_name) from game_info;{noformat}
> The query will return the number of rows of the table game_info instead of 
> that of game_info_extend. It would be better to throw an exception to avoid 
> such cases.
>  
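
For comparison, the same query shape can be run against an engine in which the 
WITH name shadows the table. The sketch below uses SQLite purely as a stand-in 
(it says nothing about Hive's parser), and the column types and sample rows 
are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE game_info (game_name TEXT);
CREATE TABLE game_info_extend (ext_id INTEGER, dev_app_id INTEGER, game_name TEXT);
INSERT INTO game_info VALUES ('only_row');
INSERT INTO game_info_extend VALUES (1, 1, 'a'), (1, 1, 'a'), (2, 2, 'b');
""")

# Same shape as the query in the issue: the CTE reuses the table's name.
cte_count = conn.execute("""
    WITH game_info AS (
        SELECT DISTINCT ext_id, dev_app_id, game_name
        FROM game_info_extend)
    SELECT count(game_name) FROM game_info
""").fetchone()[0]

# Plain count on the physical table, for reference.
table_count = conn.execute(
    "SELECT count(game_name) FROM game_info"
).fetchone()[0]

print(cte_count, table_count)
```

Here the CTE result (the distinct rows of game_info_extend) is counted rather 
than the physical table, which is the outcome a reader of the query would 
normally expect and the reason the silent fallback to the table is confusing.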



