[jira] [Created] (HIVE-27400) Upgrade jetty-server to 11.0.15

2023-06-01 Thread Akshat Mathur (Jira)
Akshat Mathur created HIVE-27400:


 Summary: Upgrade jetty-server to 11.0.15
 Key: HIVE-27400
 URL: https://issues.apache.org/jira/browse/HIVE-27400
 Project: Hive
  Issue Type: Improvement
Reporter: Akshat Mathur


Due to multiple CVEs in the current version (9.4.40.v20210413), upgrade 
jetty-server to 11.0.15.
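For reference, the bump would look roughly like the following Maven fragment; the `jetty.version` property name is an assumption, not necessarily how Hive's actual pom.xml is laid out:

```xml
<!-- Hypothetical sketch of the version bump; the property name is an
     assumption and may differ from Hive's real pom.xml. -->
<properties>
  <jetty.version>11.0.15</jetty.version>
</properties>

<dependency>
  <groupId>org.eclipse.jetty</groupId>
  <artifactId>jetty-server</artifactId>
  <version>${jetty.version}</version>
</dependency>
```

Note that Jetty 11 moves from the `javax.servlet` to the `jakarta.servlet` namespace, so the upgrade likely involves more than a version bump.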



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-27400) Upgrade jetty-server to 11.0.15

2023-06-01 Thread Akshat Mathur (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akshat Mathur updated HIVE-27400:
-
Summary: Upgrade jetty-server to 11.0.15  (was: UPgrade jetty-server to 
11.0.15)

> Upgrade jetty-server to 11.0.15
> ---
>
> Key: HIVE-27400
> URL: https://issues.apache.org/jira/browse/HIVE-27400
> Project: Hive
>  Issue Type: Improvement
>Reporter: Akshat Mathur
>Priority: Major
>
> Due to multiple CVEs in the current version (9.4.40.v20210413), upgrade 
> jetty-server to 11.0.15.





[jira] [Work started] (HIVE-27355) Iceberg: Create table can be slow due to file listing for stats

2023-06-01 Thread Dmitriy Fingerman (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-27355 started by Dmitriy Fingerman.

> Iceberg: Create table can be slow due to file listing for stats
> ---
>
> Key: HIVE-27355
> URL: https://issues.apache.org/jira/browse/HIVE-27355
> Project: Hive
>  Issue Type: Improvement
>  Components: Iceberg integration
>Reporter: Rajesh Balamohan
>Assignee: Dmitriy Fingerman
>Priority: Major
>  Labels: pull-request-available
>
> The stack trace may differ on the Hive master branch, but the issue is that 
> stats need not be populated for Iceberg tables; currently Hive performs 
> recursive file listings, causing delays during table creation (e.g. CTAS).
>  
> {noformat}
> at 
> org.apache.hadoop.hive.common.FileUtils.listStatusRecursively(FileUtils.java:329)
>   at 
> org.apache.hadoop.hive.common.FileUtils.listStatusRecursively(FileUtils.java:330)
>   at 
> org.apache.hadoop.hive.common.FileUtils.listStatusRecursively(FileUtils.java:330)
>   at 
> org.apache.hadoop.hive.common.HiveStatsUtils.getFileStatusRecurse(HiveStatsUtils.java:61)
>   at 
> org.apache.hadoop.hive.metastore.Warehouse.getFileStatusesForUnpartitionedTable(Warehouse.java:581)
>   at 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.updateTableStatsFast(MetaStoreUtils.java:201)
>   at 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.updateTableStatsFast(MetaStoreUtils.java:194)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_core(HiveMetaStore.java:1445)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_with_environment_context(HiveMetaStore.java:1502)
>   at sun.reflect.GeneratedMethodAccessor118.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:148)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107)
>   at com.sun.proxy.$Proxy49.create_table_with_environment_context(Unknown 
> Source)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.create_table_with_environment_context(HiveMetaStoreClient.java:2419)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:755)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:743)
>   at sun.reflect.GeneratedMethodAccessor117.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>  {noformat}
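To illustrate why the listing above is expensive, here is a minimal, self-contained Java sketch in the spirit of FileUtils.listStatusRecursively (using java.nio rather than Hadoop's FileSystem API, so this is an analogy, not Hive's code): every nested directory costs one more listing call, which on object stores translates into extra remote requests.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class RecursiveListing {
    // Walks a directory tree the way FileUtils.listStatusRecursively does:
    // each subdirectory triggers another listing call.
    static List<Path> listRecursively(Path dir) throws IOException {
        List<Path> result = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path entry : stream) {
                if (Files.isDirectory(entry)) {
                    // One extra listing per subdirectory; deep partition
                    // trees multiply this cost.
                    result.addAll(listRecursively(entry));
                } else {
                    result.add(entry);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("stats");
        Path sub = Files.createDirectory(root.resolve("part=1"));
        Files.createFile(sub.resolve("data.parquet"));
        Files.createFile(root.resolve("metadata.json"));
        System.out.println(listRecursively(root).size()); // prints 2
    }
}
```

Since Iceberg maintains its own statistics in table metadata, skipping this listing entirely for Iceberg tables avoids the cost.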





[jira] [Assigned] (HIVE-27355) Iceberg: Create table can be slow due to file listing for stats

2023-06-01 Thread Dmitriy Fingerman (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitriy Fingerman reassigned HIVE-27355:


Assignee: Dmitriy Fingerman

> Iceberg: Create table can be slow due to file listing for stats
> ---
>
> Key: HIVE-27355
> URL: https://issues.apache.org/jira/browse/HIVE-27355
> Project: Hive
>  Issue Type: Improvement
>  Components: Iceberg integration
>Reporter: Rajesh Balamohan
>Assignee: Dmitriy Fingerman
>Priority: Major
>  Labels: pull-request-available
>
> The stack trace may differ on the Hive master branch, but the issue is that 
> stats need not be populated for Iceberg tables; currently Hive performs 
> recursive file listings, causing delays during table creation (e.g. CTAS).
>  
> {noformat}
> at 
> org.apache.hadoop.hive.common.FileUtils.listStatusRecursively(FileUtils.java:329)
>   at 
> org.apache.hadoop.hive.common.FileUtils.listStatusRecursively(FileUtils.java:330)
>   at 
> org.apache.hadoop.hive.common.FileUtils.listStatusRecursively(FileUtils.java:330)
>   at 
> org.apache.hadoop.hive.common.HiveStatsUtils.getFileStatusRecurse(HiveStatsUtils.java:61)
>   at 
> org.apache.hadoop.hive.metastore.Warehouse.getFileStatusesForUnpartitionedTable(Warehouse.java:581)
>   at 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.updateTableStatsFast(MetaStoreUtils.java:201)
>   at 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.updateTableStatsFast(MetaStoreUtils.java:194)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_core(HiveMetaStore.java:1445)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_with_environment_context(HiveMetaStore.java:1502)
>   at sun.reflect.GeneratedMethodAccessor118.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:148)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107)
>   at com.sun.proxy.$Proxy49.create_table_with_environment_context(Unknown 
> Source)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.create_table_with_environment_context(HiveMetaStoreClient.java:2419)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:755)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:743)
>   at sun.reflect.GeneratedMethodAccessor117.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>  {noformat}





[jira] [Assigned] (HIVE-27399) Add lateral view support with separate CBO files

2023-06-01 Thread Steve Carlin (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Carlin reassigned HIVE-27399:
---

Assignee: Steve Carlin

> Add lateral view support with separate CBO files
> 
>
> Key: HIVE-27399
> URL: https://issues.apache.org/jira/browse/HIVE-27399
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Reporter: Steve Carlin
>Assignee: Steve Carlin
>Priority: Major
>
> This subtask consists of the bulk of the work, but some cleanup will be 
> needed afterwards.
> The lateral views should be CBO-enabled, but I'm going to create separate new 
> files for the EXPLAIN CBO statements to make the review slightly easier.
> A later subtask will clean this up.





[jira] [Assigned] (HIVE-27391) Refactor lateral views in CBO

2023-06-01 Thread Steve Carlin (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Carlin reassigned HIVE-27391:
---

Assignee: Steve Carlin

> Refactor lateral views in CBO 
> --
>
> Key: HIVE-27391
> URL: https://issues.apache.org/jira/browse/HIVE-27391
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Reporter: Steve Carlin
>Assignee: Steve Carlin
>Priority: Major
>  Labels: pull-request-available
>
> We should refactor the creation of the Calcite RelNode for 
> HiveTableFunctionScan with lateral views to make it easier to develop support 
> for all lateral views.





[jira] [Assigned] (HIVE-27390) Support all lateral views in CBO

2023-06-01 Thread Steve Carlin (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Carlin reassigned HIVE-27390:
---

Assignee: Steve Carlin

> Support all lateral views in CBO
> 
>
> Key: HIVE-27390
> URL: https://issues.apache.org/jira/browse/HIVE-27390
> Project: Hive
>  Issue Type: New Feature
>  Components: CBO
>Reporter: Steve Carlin
>Assignee: Steve Carlin
>Priority: Major
>
> Currently, only the inline UDTF with a VALUES clause is supported in CBO. 
> We should support all UDTF clauses.





[jira] [Created] (HIVE-27399) Add lateral view support with separate CBO files

2023-06-01 Thread Steve Carlin (Jira)
Steve Carlin created HIVE-27399:
---

 Summary: Add lateral view support with separate CBO files
 Key: HIVE-27399
 URL: https://issues.apache.org/jira/browse/HIVE-27399
 Project: Hive
  Issue Type: Sub-task
  Components: HiveServer2
Reporter: Steve Carlin


This subtask consists of the bulk of the work, but some cleanup will be needed 
afterwards.

The lateral views should be CBO-enabled, but I'm going to create separate new 
files for the EXPLAIN CBO statements to make the review slightly easier.

A later subtask will clean this up.





[jira] [Updated] (HIVE-27398) SHOW CREATE TABLE doesn't output backticks for CLUSTERED by Col names

2023-06-01 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-27398:
--
Labels: pull-request-available  (was: )

> SHOW CREATE TABLE doesn't output backticks for CLUSTERED by Col names
> -
>
> Key: HIVE-27398
> URL: https://issues.apache.org/jira/browse/HIVE-27398
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Riju Trivedi
>Assignee: Riju Trivedi
>Priority: Minor
>  Labels: pull-request-available
>
> SHOW CREATE TABLE output uses backticks for all column names and partition 
> column names but does not include backticks for CLUSTERED BY column names. 
> This causes ParseException during table creation when any bucket column 
> identifier matches reserved keywords 
> {code:java}
> CREATE TABLE `test_ts_reserved_keyword7`(
> `member_id` varchar(8),
> `plan_nr` varchar(11),
> `timestamp` timestamp,
> `shared_ind` varchar(1),
> `user_id` varchar(8))
> CLUSTERED BY (
> member_nr,
> plan_nr,
> `timestamp`)
> INTO 4 BUCKETS;
> SHOW CREATE TABLE test_ts_reserved_keyword7;
> CREATE TABLE `test_ts_reserved_keyword7`(
> `member_id` varchar(8),
> `plan_nr` varchar(11),
> `timestamp` timestamp, 
> `shared_ind` varchar(1), 
> `user_id` varchar(8)) 
> CLUSTERED BY (
> member_id,
> plan_nr,
> timestamp)
> INTO 4 BUCKETS
> ROW FORMAT SERDE
>   'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
> STORED AS INPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
> OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat';
> This fails with "Error while compiling statement: FAILED: ParseException line 
> 13:0 cannot recognize input near 'timestamp' ')' 'INTO' in column name"{code}
>  
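The likely shape of the fix is to quote bucket column names the same way SHOW CREATE TABLE already quotes regular and partition column names. A minimal sketch of such a quoting helper (the class and method names are illustrative, not Hive's actual API):

```java
public class IdentifierQuoting {
    // Wrap an identifier in backticks, doubling any embedded backtick, so
    // reserved words like `timestamp` survive a round trip through
    // SHOW CREATE TABLE. Names here are assumptions, not Hive's API.
    static String quoteIdentifier(String name) {
        return "`" + name.replace("`", "``") + "`";
    }

    public static void main(String[] args) {
        System.out.println(quoteIdentifier("timestamp")); // prints `timestamp`
    }
}
```

Applying this to the CLUSTERED BY column list would make the generated DDL parse even when a bucket column name matches a reserved keyword.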





[jira] [Updated] (HIVE-27398) SHOW CREATE TABLE doesn't output backticks for CLUSTERED by Col names

2023-06-01 Thread Riju Trivedi (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Riju Trivedi updated HIVE-27398:

Description: 
SHOW CREATE TABLE output uses backticks for all column names and partition 
column names but does not include backticks for CLUSTERED BY column names. This 
causes ParseException during table creation when any bucket column identifier 
matches reserved keywords 
{code:java}
CREATE TABLE `test_ts_reserved_keyword7`(
`member_id` varchar(8),
`plan_nr` varchar(11),
`timestamp` timestamp,
`shared_ind` varchar(1),
`user_id` varchar(8))
CLUSTERED BY (
member_nr,
plan_nr,
`timestamp`)
INTO 4 BUCKETS;

SHOW CREATE TABLE test_ts_reserved_keyword7;
CREATE TABLE `test_ts_reserved_keyword7`(
`member_id` varchar(8),
`plan_nr` varchar(11),
`timestamp` timestamp, 
`shared_ind` varchar(1), 
`user_id` varchar(8)) 
CLUSTERED BY (
member_id,
plan_nr,
timestamp)
INTO 4 BUCKETS
ROW FORMAT SERDE
  'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
STORED AS INPUTFORMAT
  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat';

This fails with "Error while compiling statement: FAILED: ParseException line 
13:0 cannot recognize input near 'timestamp' ')' 'INTO' in column name"{code}
 

  was:
SHOW CREATE TABLE output uses backticks for all column names and partition 
column names but does not include backticks for CLUSTERED BY column names. This 
causes ParseException during table creation when any bucket column identifier 
matches reserved keywords 
{code:java}
CREATE TABLE `test_ts_reserved_keyword7`(
`member_id` varchar(8),
`plan_nr` varchar(11),
`timestamp` timestamp,
`shared_ind` varchar(1),
`user_id` varchar(8))
CLUSTERED BY (
member_nr,
plan_nr,
`timestamp`)
INTO 4 BUCKETS;

SHOW CREATE TABLE test_ts_reserved_keyword7;
CREATE TABLE `test_ts_reserved_keyword7`(
`member_id` varchar(8),
`plan_nr` varchar(11),
`timestamp` timestamp, 
`shared_ind` varchar(1), 
`user_id` varchar(8)) 
CLUSTERED BY (
member_id,
plan_nr,
timestamp)
INTO 4 BUCKETS
ROW FORMAT SERDE
  'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
STORED AS INPUTFORMAT
  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat';
{code}
 


> SHOW CREATE TABLE doesn't output backticks for CLUSTERED by Col names
> -
>
> Key: HIVE-27398
> URL: https://issues.apache.org/jira/browse/HIVE-27398
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Riju Trivedi
>Assignee: Riju Trivedi
>Priority: Minor
>
> SHOW CREATE TABLE output uses backticks for all column names and partition 
> column names but does not include backticks for CLUSTERED BY column names. 
> This causes ParseException during table creation when any bucket column 
> identifier matches reserved keywords 
> {code:java}
> CREATE TABLE `test_ts_reserved_keyword7`(
> `member_id` varchar(8),
> `plan_nr` varchar(11),
> `timestamp` timestamp,
> `shared_ind` varchar(1),
> `user_id` varchar(8))
> CLUSTERED BY (
> member_nr,
> plan_nr,
> `timestamp`)
> INTO 4 BUCKETS;
> SHOW CREATE TABLE test_ts_reserved_keyword7;
> CREATE TABLE `test_ts_reserved_keyword7`(
> `member_id` varchar(8),
> `plan_nr` varchar(11),
> `timestamp` timestamp, 
> `shared_ind` varchar(1), 
> `user_id` varchar(8)) 
> CLUSTERED BY (
> member_id,
> plan_nr,
> timestamp)
> INTO 4 BUCKETS
> ROW FORMAT SERDE
>   'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
> STORED AS INPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
> OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat';
> This fails with "Error while compiling statement: FAILED: ParseException line 
> 13:0 cannot recognize input near 'timestamp' ')' 'INTO' in column name"{code}
>  





[jira] [Updated] (HIVE-27398) SHOW CREATE TABLE doesn't output backticks for CLUSTERED by Col names

2023-06-01 Thread Riju Trivedi (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Riju Trivedi updated HIVE-27398:

Description: 
SHOW CREATE TABLE output uses backticks for all column names and partition 
column names but does not include backticks for CLUSTERED BY column names. This 
causes ParseException during table creation when any bucket column identifier 
matches reserved keywords 
{code:java}
CREATE TABLE `test_ts_reserved_keyword7`(
`member_id` varchar(8),
`plan_nr` varchar(11),
`timestamp` timestamp,
`shared_ind` varchar(1),
`user_id` varchar(8))
CLUSTERED BY (
member_nr,
plan_nr,
`timestamp`)
INTO 4 BUCKETS;

SHOW CREATE TABLE test_ts_reserved_keyword7;
CREATE TABLE `test_ts_reserved_keyword7`(
`member_id` varchar(8),
`plan_nr` varchar(11),
`timestamp` timestamp, 
`shared_ind` varchar(1), 
`user_id` varchar(8)) 
CLUSTERED BY (
member_id,
plan_nr,
timestamp)
INTO 4 BUCKETS
ROW FORMAT SERDE
  'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
STORED AS INPUTFORMAT
  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat';
{code}
 

  was:
SHOW CREATE TABLE output uses backticks for all column names and partition 
column names but does not include backticks for CLUSTERED BY column names. This 
causes ParseException during table creation when any bucket column identifier 
matches reserved keywords when 
CREATE TABLE `test_ts_reserved_keyword7`(
`member_id` varchar(8),
`plan_nr` varchar(11),
`timestamp` timestamp,
`shared_ind` varchar(1),
`user_id` varchar(8))
CLUSTERED BY (
member_nr,
plan_nr,
`timestamp`)
INTO 4 BUCKETS;
SHOW CREATE TABLE OUTPUT
CREATE TABLE `test_ts_reserved_keyword7`(
`member_id` varchar(8),
 `plan_nr` varchar(11),
 `timestamp` timestamp, 
`shared_ind` varchar(1), 
`user_id` varchar(8)) 
CLUSTERED BY (
member_id,
plan_nr,
timestamp)
INTO 4 BUCKETS
ROW FORMAT SERDE
  'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
STORED AS INPUTFORMAT
  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat';


> SHOW CREATE TABLE doesn't output backticks for CLUSTERED by Col names
> -
>
> Key: HIVE-27398
> URL: https://issues.apache.org/jira/browse/HIVE-27398
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Riju Trivedi
>Assignee: Riju Trivedi
>Priority: Minor
>
> SHOW CREATE TABLE output uses backticks for all column names and partition 
> column names but does not include backticks for CLUSTERED BY column names. 
> This causes ParseException during table creation when any bucket column 
> identifier matches reserved keywords 
> {code:java}
> CREATE TABLE `test_ts_reserved_keyword7`(
> `member_id` varchar(8),
> `plan_nr` varchar(11),
> `timestamp` timestamp,
> `shared_ind` varchar(1),
> `user_id` varchar(8))
> CLUSTERED BY (
> member_nr,
> plan_nr,
> `timestamp`)
> INTO 4 BUCKETS;
> SHOW CREATE TABLE test_ts_reserved_keyword7;
> CREATE TABLE `test_ts_reserved_keyword7`(
> `member_id` varchar(8),
> `plan_nr` varchar(11),
> `timestamp` timestamp, 
> `shared_ind` varchar(1), 
> `user_id` varchar(8)) 
> CLUSTERED BY (
> member_id,
> plan_nr,
> timestamp)
> INTO 4 BUCKETS
> ROW FORMAT SERDE
>   'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
> STORED AS INPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
> OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat';
> {code}
>  





[jira] [Created] (HIVE-27398) SHOW CREATE TABLE doesn't output backticks for CLUSTERED by Col names

2023-06-01 Thread Riju Trivedi (Jira)
Riju Trivedi created HIVE-27398:
---

 Summary: SHOW CREATE TABLE doesn't output backticks for CLUSTERED 
by Col names
 Key: HIVE-27398
 URL: https://issues.apache.org/jira/browse/HIVE-27398
 Project: Hive
  Issue Type: Bug
  Components: Hive
Reporter: Riju Trivedi
Assignee: Riju Trivedi


SHOW CREATE TABLE output uses backticks for all column names and partition 
column names but does not include backticks for CLUSTERED BY column names. This 
causes ParseException during table creation when any bucket column identifier 
matches reserved keywords when 
CREATE TABLE `test_ts_reserved_keyword7`(
`member_id` varchar(8),
`plan_nr` varchar(11),
`timestamp` timestamp,
`shared_ind` varchar(1),
`user_id` varchar(8))
CLUSTERED BY (
member_nr,
plan_nr,
`timestamp`)
INTO 4 BUCKETS;
SHOW CREATE TABLE OUTPUT
CREATE TABLE `test_ts_reserved_keyword7`(
`member_id` varchar(8),
 `plan_nr` varchar(11),
 `timestamp` timestamp, 
`shared_ind` varchar(1), 
`user_id` varchar(8)) 
CLUSTERED BY (
member_id,
plan_nr,
timestamp)
INTO 4 BUCKETS
ROW FORMAT SERDE
  'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
STORED AS INPUTFORMAT
  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat';





[jira] [Commented] (HIVE-22415) Upgrade to Java 11

2023-06-01 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17728376#comment-17728376
 ] 

Ayush Saxena commented on HIVE-22415:
-

Hi [~belugabehr]

Do you plan to work on this? If not, we can take over.

> Upgrade to Java 11
> --
>
> Key: HIVE-22415
> URL: https://issues.apache.org/jira/browse/HIVE-22415
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Critical
>  Labels: pull-request-available
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> Upgrade Hive to Java JDK 11





[jira] [Comment Edited] (HIVE-22415) Upgrade to Java 11

2023-06-01 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17728376#comment-17728376
 ] 

Ayush Saxena edited comment on HIVE-22415 at 6/1/23 1:47 PM:
-

Hi [~belugabehr]

Do you plan to work on this? If not, we can take over and try to experiment a bit.


was (Author: ayushtkn):
Hi [~belugabehr]

Do you plan to work on this? If not, we can take over.

> Upgrade to Java 11
> --
>
> Key: HIVE-22415
> URL: https://issues.apache.org/jira/browse/HIVE-22415
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Critical
>  Labels: pull-request-available
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> Upgrade Hive to Java JDK 11





[jira] [Resolved] (HIVE-27392) Iceberg: Use String instead of Long for file length in HadoopInputFile

2023-06-01 Thread Ayush Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena resolved HIVE-27392.
-
Fix Version/s: 4.0.0
   Resolution: Fixed

> Iceberg: Use String instead of Long for file length in HadoopInputFile
> --
>
> Key: HIVE-27392
> URL: https://issues.apache.org/jira/browse/HIVE-27392
> Project: Hive
>  Issue Type: Bug
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> Apply the workaround mentioned over here:
> https://issues.apache.org/jira/browse/HADOOP-18724?focusedCommentId=17718087=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17718087
>  
>  





[jira] [Commented] (HIVE-27397) HivePreparedStatement cannot execute SQL with question marks in comments

2023-06-01 Thread yx91490 (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-27397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17728331#comment-17728331
 ] 

yx91490 commented on HIVE-27397:


The most suitable solution may be to move the parameter-substitution process to 
the server side, so the SQL parser can remove comments safely, but that is a big 
change.

> HivePreparedStatement cannot execute SQL with question marks in comments
> 
>
> Key: HIVE-27397
> URL: https://issues.apache.org/jira/browse/HIVE-27397
> Project: Hive
>  Issue Type: Bug
>Affects Versions: All Versions
>Reporter: yx91490
>Priority: Major
>
> The following code snippet fails to execute:
> {code:java}
> String sql = "select 1 --?";
> try (Connection connection = DriverManager.getConnection(url);
> PreparedStatement stmt = connection.prepareStatement(sql)) {
>   try (ResultSet rs = stmt.executeQuery()) {
>   }
> } {code}
> The error message may look like:
> {code:java}
> Exception in thread "main" java.sql.SQLException: Parameter #1 is unset
>     at 
> org.apache.hive.jdbc.HivePreparedStatement.updateSql(HivePreparedStatement.java:122)
>     at 
> org.apache.hive.jdbc.HivePreparedStatement.executeQuery(HivePreparedStatement.java:100)
>  
> ..{code}
> The cause is that HivePreparedStatement.splitSqlStatement(sql) considers all 
> characters of the SQL template, including comments, which may unfortunately 
> contain a question mark.
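A client-side alternative is to skip parameter markers that appear inside comments or string literals while splitting. A minimal, self-contained sketch (not Hive's actual implementation; it handles only single-quoted strings and `--` line comments):

```java
import java.util.ArrayList;
import java.util.List;

public class SqlSplitter {
    // Split a SQL template on '?' parameter markers, ignoring '?' that
    // occurs inside single-quoted literals or '--' line comments.
    // Illustrative only; block comments and escapes are not handled.
    static List<String> split(String sql) {
        List<String> parts = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        boolean inString = false, inLineComment = false;
        for (int i = 0; i < sql.length(); i++) {
            char c = sql.charAt(i);
            if (inLineComment) {
                if (c == '\n') inLineComment = false;
            } else if (inString) {
                if (c == '\'') inString = false;
            } else if (c == '\'') {
                inString = true;
            } else if (c == '-' && i + 1 < sql.length() && sql.charAt(i + 1) == '-') {
                inLineComment = true;
            } else if (c == '?') {
                parts.add(cur.toString()); // real parameter marker: split here
                cur.setLength(0);
                continue;
            }
            cur.append(c);
        }
        parts.add(cur.toString());
        return parts;
    }

    public static void main(String[] args) {
        // '?' inside a line comment is not treated as a parameter marker.
        System.out.println(split("select 1 --?").size()); // prints 1
        System.out.println(split("select ?").size());     // prints 2
    }
}
```

With such a splitter, `select 1 --?` yields a single fragment and no parameter slots, so no "Parameter #1 is unset" error is raised.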





[jira] [Commented] (HIVE-27392) Iceberg: Use String instead of Long for file length in HadoopInputFile

2023-06-01 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-27392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17728330#comment-17728330
 ] 

Ayush Saxena commented on HIVE-27392:
-

Committed to master.

Thanks [~dkuzmenko] and [~ste...@apache.org] for the reviews!

> Iceberg: Use String instead of Long for file length in HadoopInputFile
> --
>
> Key: HIVE-27392
> URL: https://issues.apache.org/jira/browse/HIVE-27392
> Project: Hive
>  Issue Type: Bug
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
>  Labels: pull-request-available
>
> Apply the workaround mentioned over here:
> https://issues.apache.org/jira/browse/HADOOP-18724?focusedCommentId=17718087=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17718087
>  
>  





[jira] [Updated] (HIVE-27397) HivePreparedStatement cannot execute SQL with question marks in comments

2023-06-01 Thread yx91490 (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

yx91490 updated HIVE-27397:
---
Description: 
The following code snippet fails to execute:
{code:java}
String sql = "select 1 --?";
try (Connection connection = DriverManager.getConnection(url);
PreparedStatement stmt = connection.prepareStatement(sql)) {
  try (ResultSet rs = stmt.executeQuery()) {
  }
} {code}
The error message may look like:
{code:java}
Exception in thread "main" java.sql.SQLException: Parameter #1 is unset
    at 
org.apache.hive.jdbc.HivePreparedStatement.updateSql(HivePreparedStatement.java:122)
    at 
org.apache.hive.jdbc.HivePreparedStatement.executeQuery(HivePreparedStatement.java:100)
 
..{code}
The cause is that HivePreparedStatement.splitSqlStatement(sql) considers all 
characters of the SQL template, including comments, which may unfortunately 
contain a question mark.

  was:
The following code snippet fails to execute:
{code:java}
String sql = "select 1 --?";
try (Connection connection = DriverManager.getConnection(url);
PreparedStatement stmt = connection.prepareStatement(sql)) {
  try (ResultSet rs = stmt.executeQuery()) {
  }
} {code}
The error message may look like:
{code:java}
Exception in thread "main" java.sql.SQLException: Parameter #1 is unset
    at 
org.apache.hive.jdbc.HivePreparedStatement.updateSql(HivePreparedStatement.java:122)
    at 
org.apache.hive.jdbc.HivePreparedStatement.executeQuery(HivePreparedStatement.java:100)
 {code}
The cause is that HivePreparedStatement.splitSqlStatement(sql) considers all 
characters of the SQL template, including comments, which may unfortunately 
contain a question mark.


> HivePreparedStatement cannot execute SQL with question marks in comments
> 
>
> Key: HIVE-27397
> URL: https://issues.apache.org/jira/browse/HIVE-27397
> Project: Hive
>  Issue Type: Bug
>Affects Versions: All Versions
>Reporter: yx91490
>Priority: Major
>
> The following code snippet fails to execute:
> {code:java}
> String sql = "select 1 --?";
> try (Connection connection = DriverManager.getConnection(url);
> PreparedStatement stmt = connection.prepareStatement(sql)) {
>   try (ResultSet rs = stmt.executeQuery()) {
>   }
> } {code}
> The error message may look like:
> {code:java}
> Exception in thread "main" java.sql.SQLException: Parameter #1 is unset
>     at 
> org.apache.hive.jdbc.HivePreparedStatement.updateSql(HivePreparedStatement.java:122)
>     at 
> org.apache.hive.jdbc.HivePreparedStatement.executeQuery(HivePreparedStatement.java:100)
>  
> ..{code}
> The cause is that HivePreparedStatement.splitSqlStatement(sql) considers all 
> characters of the SQL template, including comments, which may unfortunately 
> contain a question mark.





[jira] [Created] (HIVE-27397) HivePreparedStatement cannot execute SQL with question marks in comments

2023-06-01 Thread yx91490 (Jira)
yx91490 created HIVE-27397:
--

 Summary: HivePreparedStatement cannot execute SQL with question 
marks in comments
 Key: HIVE-27397
 URL: https://issues.apache.org/jira/browse/HIVE-27397
 Project: Hive
  Issue Type: Bug
Affects Versions: All Versions
Reporter: yx91490


The following code snippet fails to execute:
{code:java}
String sql = "select 1 --?";
try (Connection connection = DriverManager.getConnection(url);
PreparedStatement stmt = connection.prepareStatement(sql)) {
  try (ResultSet rs = stmt.executeQuery()) {
  }
} {code}
The error message may look like:
{code:java}
Exception in thread "main" java.sql.SQLException: Parameter #1 is unset
    at 
org.apache.hive.jdbc.HivePreparedStatement.updateSql(HivePreparedStatement.java:122)
    at 
org.apache.hive.jdbc.HivePreparedStatement.executeQuery(HivePreparedStatement.java:100)
 {code}
The cause is that HivePreparedStatement.splitSqlStatement(sql) considers all 
characters of the SQL template, including comments, which may unfortunately 
contain a question mark.





[jira] [Updated] (HIVE-27332) Add retry backoff mechanism for abort cleanup

2023-06-01 Thread Sourabh Badhya (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sourabh Badhya updated HIVE-27332:
--
Description: 
HIVE-27019 and HIVE-27020 added the functionality to directly clean data 
directories from aborted transactions without using Initiator & Worker. 
However, in the event of continuous cleanup failures, the retry mechanism is 
initiated every single time. We need to add a retry-backoff mechanism that 
controls when the next retry is initiated instead of retrying continuously.

There are broadly 3 cases in which retry due to abort cleanup is impacted - 
*1. Abort cleanup on the table failed + Compaction on the table failed.*
*2. Abort cleanup on the table failed + Compaction on the table passed*
*3. Abort cleanup on the table failed + No compaction on the table.*

*Solution -* 

*We reuse the COMPACTION_QUEUE table to store the retry metadata -* 

*Advantage: Most of the retry-related fields are already present in 
COMPACTION_QUEUE, hence we can use the same table for storing retry metadata. 
A compaction type called ABORT_CLEANUP ('c') is introduced. The CQ_STATE will 
remain "ready for cleaning" for such records.*

*Actions performed by TaskHandler in the case of failure -* 

*AbortTxnCleaner -* 
Action: Just add retry details in the queue table during the abort failure.
*CompactionCleaner -* 
Action: If compaction on the same table is successful, delete the retry entry 
in markCleaned when removing any TXN_COMPONENTS entries, except when there are 
no uncompacted aborts. We do not want a situation where there is a queue entry 
for a table but no record in TXN_COMPONENTS associated with that table.

*Advantage: Expecting no performance issues with this approach, since most of 
the time we delete a single record for the associated table/partition.*
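The backoff idea above can be sketched as capped exponential growth of the 
retry-retention interval. The base interval and cap below are assumed for 
illustration and are not taken from the actual HIVE-27332 patch:
{code:java}
public class RetryBackoff {
    // Doubles the retention interval on every consecutive failure, capped at
    // a maximum, so a repeatedly failing abort-cleanup entry is retried less
    // and less often. Hypothetical sketch of the policy described above.
    static long nextRetentionMs(long currentMs, long baseMs, long maxMs) {
        if (currentMs <= 0) {
            return baseMs;                       // first failure: start at the base interval
        }
        return Math.min(currentMs * 2, maxMs);   // exponential growth, capped at maxMs
    }

    public static void main(String[] args) {
        long base = 5 * 60_000L;         // assumed 5-minute base interval
        long max = 8 * 60 * 60_000L;     // assumed 8-hour cap
        long retention = 0;
        for (int failure = 1; failure <= 4; failure++) {
            retention = nextRetentionMs(retention, base, max);
            System.out.println("failure " + failure + ": retry after " + retention + " ms");
        }
        // prints 300000, 600000, 1200000, 2400000 ms across the four failures
    }
}
{code}
An entry whose retention has not yet elapsed is simply skipped by the cleaner, 
which is why storing the retention value (e.g. in CQ_RETRY_RETENTION) alongside 
the queue record is sufficient.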

  was:
HIVE-27019 and HIVE-27020 added the functionality to directly clean data 
directories from aborted transactions without using Initiator & Worker. 
However, during the event of continuous failure during cleanup, the retry 
mechanism is initiated every single time. We need to add retry backoff 
mechanism to control the time required to initiate retry again and not 
continuously retry.

There are widely 3 cases wherein retry due to abort cleanup is impacted - 
*1. Abort cleanup on the table failed + Compaction on the table failed.*
*2. Abort cleanup on the table failed + Compaction on the table passed*
*3. Abort cleanup on the table failed + No compaction on the table.*

*Solution -* 

*We create a new table called TXN_CLEANUP_QUEUE with following fields to store 
the retry metadata -* 
CREATE TABLE TXN_CLEANUP_QUEUE (
TCQ_DATABASE varchar(128) NOT NULL, 
TCQ_TABLE varchar(256) NOT NULL,
TCQ_PARTITION varchar(767), 
TCQ_RETRY_RETENTION bigint NOT NULL DEFAULT 0, 
TCQ_ERROR_MESSAGE mediumtext in MySQL / clob in derby, oracle DB / text in 
postgres / varchar(max) in mssql DB

);

*Advantage: Separates the flow of metadata. We also eliminate the chance of 
breaking the compaction/abort cleanup when modifying metadata of abort 
cleanup/compaction. Easier debugging in case of failures.*

*Actions performed by TaskHandler in the case of failure -* 

*AbortTxnCleaner -* 
Action: Just add retry details in the queue table during the abort failure.
*CompactionCleaner -* 
Action: If compaction on the same table is successful, delete the retry entry 
in markCleaned when removing any TXN_COMPONENTS entries except when there are 
no uncompacted aborts. We do not want to be in a situation where there is a 
queue entry for a table but there is no record in TXN_COMPONENTS associated 
with the same table.

*Advantage: Expecting no performance issues with this approach. Since we delete 
1 record most of the times for the associated table/partition.*


> Add retry backoff mechanism for abort cleanup
> -
>
> Key: HIVE-27332
> URL: https://issues.apache.org/jira/browse/HIVE-27332
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sourabh Badhya
>Assignee: Sourabh Badhya
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> HIVE-27019 and HIVE-27020 added the functionality to directly clean data 
> directories from aborted transactions without using Initiator & Worker. 
> However, in the event of repeated cleanup failures, the retry mechanism is 
> initiated every single time. We need to add a retry backoff mechanism to 
> space out retries instead of retrying continuously.
> There are broadly 3 cases wherein retry due to abort cleanup is impacted - 
> *1. Abort cleanup on the table failed + Compaction on the table failed.*
> *2. Abort cleanup on the table failed + Compaction on the table passed*
> *3. Abort cleanup on the 

[jira] [Resolved] (HIVE-27366) Incorrect incremental rebuild mode shown of materialized view with Iceberg sources

2023-06-01 Thread Krisztian Kasa (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Kasa resolved HIVE-27366.
---
Resolution: Fixed

> Incorrect incremental rebuild mode shown of materialized view with Iceberg 
> sources
> --
>
> Key: HIVE-27366
> URL: https://issues.apache.org/jira/browse/HIVE-27366
> Project: Hive
>  Issue Type: Bug
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
>
> {code}
> CREATE TABLE shtb_test1(KEY INT, VALUE STRING) PARTITIONED BY(ds STRING)
> stored by iceberg stored as orc tblproperties ('format-version'='2');
> CREATE MATERIALIZED VIEW shtb_test1_view1 stored by iceberg stored as orc 
> tblproperties ('format-version'='1') AS
> SELECT * FROM shtb_test1 where KEY > 1000 and KEY < 2000;
> SHOW MATERIALIZED VIEWS;
> {code}
> {code}
> # MV Name          Rewriting Enabled  Mode            Incremental rebuild
> shtb_test1_view1    Yes                Manual refresh  Available
> {code}
> It should be 
> {code}
> # MV Name          Rewriting Enabled  Mode            Incremental rebuild
> shtb_test1_view1    Yes                Manual refresh  Available in presence of insert operations only
> {code}
> because deleted rows cannot be identified in the case of Iceberg source tables.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HIVE-27366) Incorrect incremental rebuild mode shown of materialized view with Iceberg sources

2023-06-01 Thread Krisztian Kasa (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-27366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17728263#comment-17728263
 ] 

Krisztian Kasa commented on HIVE-27366:
---

Merged to master. Thanks [~veghlaci05] for the review.

> Incorrect incremental rebuild mode shown of materialized view with Iceberg 
> sources
> --
>
> Key: HIVE-27366
> URL: https://issues.apache.org/jira/browse/HIVE-27366
> Project: Hive
>  Issue Type: Bug
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
>
> {code}
> CREATE TABLE shtb_test1(KEY INT, VALUE STRING) PARTITIONED BY(ds STRING)
> stored by iceberg stored as orc tblproperties ('format-version'='2');
> CREATE MATERIALIZED VIEW shtb_test1_view1 stored by iceberg stored as orc 
> tblproperties ('format-version'='1') AS
> SELECT * FROM shtb_test1 where KEY > 1000 and KEY < 2000;
> SHOW MATERIALIZED VIEWS;
> {code}
> {code}
> # MV Name          Rewriting Enabled  Mode            Incremental rebuild
> shtb_test1_view1    Yes                Manual refresh  Available
> {code}
> It should be 
> {code}
> # MV Name          Rewriting Enabled  Mode            Incremental rebuild
> shtb_test1_view1    Yes                Manual refresh  Available in presence of insert operations only
> {code}
> because deleted rows cannot be identified in the case of Iceberg source tables.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)