[jira] [Updated] (HIVE-27564) Add log for ZooKeeperTokenStore

2023-08-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-27564:
--
Labels: pull-request-available  (was: )

> Add log for ZooKeeperTokenStore
> ---
>
> Key: HIVE-27564
> URL: https://issues.apache.org/jira/browse/HIVE-27564
> Project: Hive
>  Issue Type: Improvement
>Reporter: lvyanquan
>Assignee: lvyanquan
>Priority: Major
>  Labels: pull-request-available
> Attachments: picture 1.png
>
>
> When ZooKeeper is used to store a DelegationTokenIdentifier, we use 
> {code:java}
> TokenStoreDelegationTokenSecretManager.encodeWritable(tokenIdentifier) {code}
> to encode the token information and create the path in ZooKeeper, as shown in 
> picture1.png.
> However, once the token has been removed, the error message shows only the 
> DelegationTokenIdentifier, so we cannot tell whether the path still exists 
> or when the path in ZooKeeper was created and deleted.
> {code:java}
> public byte[] retrievePassword(DelegationTokenIdentifier identifier) throws InvalidToken {
>   DelegationTokenInformation info = this.tokenStore.getToken(identifier);
>   if (info == null) {
>     throw new InvalidToken("token expired or does not exist: " + identifier);
>   }
>   // must reuse super as info.getPassword is not accessible
>   synchronized (this) {
>     try {
>       super.currentTokens.put(identifier, info);
>       return super.retrievePassword(identifier);
>     } finally {
>       super.currentTokens.remove(identifier);
>     }
>   }
> } {code}
> So I propose adding some logging about the lifecycle of the tokenPath.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-27564) Add log for ZooKeeperTokenStore

2023-08-03 Thread lvyanquan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lvyanquan updated HIVE-27564:
-
Description: 
When ZooKeeper is used to store a DelegationTokenIdentifier, we use 
{code:java}
TokenStoreDelegationTokenSecretManager.encodeWritable(tokenIdentifier) {code}
to encode the token information and create the path in ZooKeeper, as shown in 
picture1.png.

However, once the token has been removed, the error message shows only the 
DelegationTokenIdentifier, so we cannot tell whether the path still exists 
or when the path in ZooKeeper was created and deleted.
{code:java}
public byte[] retrievePassword(DelegationTokenIdentifier identifier) throws InvalidToken {
  DelegationTokenInformation info = this.tokenStore.getToken(identifier);
  if (info == null) {
    throw new InvalidToken("token expired or does not exist: " + identifier);
  }
  // must reuse super as info.getPassword is not accessible
  synchronized (this) {
    try {
      super.currentTokens.put(identifier, info);
      return super.retrievePassword(identifier);
    } finally {
      super.currentTokens.remove(identifier);
    }
  }
} {code}
So I propose adding some logging about the lifecycle of the tokenPath.

  was:
When ZooKeeper is used to store a DelegationTokenIdentifier, we use 
{code:java}
TokenStoreDelegationTokenSecretManager.encodeWritable(tokenIdentifier) {code}
to encode the token information, as shown in picture1.png.

However, once the token has been removed, the error message shows only the 
DelegationTokenIdentifier, so we cannot tell whether the path still exists 
or when the path in ZooKeeper was created and deleted.

{code:java}
public byte[] retrievePassword(DelegationTokenIdentifier identifier) throws InvalidToken {
  DelegationTokenInformation info = this.tokenStore.getToken(identifier);
  if (info == null) {
    throw new InvalidToken("token expired or does not exist: " + identifier);
  }
  // must reuse super as info.getPassword is not accessible
  synchronized (this) {
    try {
      super.currentTokens.put(identifier, info);
      return super.retrievePassword(identifier);
    } finally {
      super.currentTokens.remove(identifier);
    }
  }
} {code}
So I propose adding some logging about the lifecycle of the tokenPath.


> Add log for ZooKeeperTokenStore
> ---
>
> Key: HIVE-27564
> URL: https://issues.apache.org/jira/browse/HIVE-27564
> Project: Hive
>  Issue Type: Improvement
>Reporter: lvyanquan
>Assignee: lvyanquan
>Priority: Major
> Attachments: picture 1.png
>
>
> When ZooKeeper is used to store a DelegationTokenIdentifier, we use 
> {code:java}
> TokenStoreDelegationTokenSecretManager.encodeWritable(tokenIdentifier) {code}
> to encode the token information and create the path in ZooKeeper, as shown in 
> picture1.png.
> However, once the token has been removed, the error message shows only the 
> DelegationTokenIdentifier, so we cannot tell whether the path still exists 
> or when the path in ZooKeeper was created and deleted.
> {code:java}
> public byte[] retrievePassword(DelegationTokenIdentifier identifier) throws InvalidToken {
>   DelegationTokenInformation info = this.tokenStore.getToken(identifier);
>   if (info == null) {
>     throw new InvalidToken("token expired or does not exist: " + identifier);
>   }
>   // must reuse super as info.getPassword is not accessible
>   synchronized (this) {
>     try {
>       super.currentTokens.put(identifier, info);
>       return super.retrievePassword(identifier);
>     } finally {
>       super.currentTokens.remove(identifier);
>     }
>   }
> } {code}
> So I propose adding some logging about the lifecycle of the tokenPath.





[jira] [Created] (HIVE-27564) Add log for ZooKeeperTokenStore

2023-08-03 Thread lvyanquan (Jira)
lvyanquan created HIVE-27564:


 Summary: Add log for ZooKeeperTokenStore
 Key: HIVE-27564
 URL: https://issues.apache.org/jira/browse/HIVE-27564
 Project: Hive
  Issue Type: Improvement
Reporter: lvyanquan
Assignee: lvyanquan
 Attachments: picture 1.png

When ZooKeeper is used to store a DelegationTokenIdentifier, we use 
{code:java}
TokenStoreDelegationTokenSecretManager.encodeWritable(tokenIdentifier) {code}
to encode the token information, as shown in picture1.png.

However, once the token has been removed, the error message shows only the 
DelegationTokenIdentifier, so we cannot tell whether the path still exists 
or when the path in ZooKeeper was created and deleted.

{code:java}
public byte[] retrievePassword(DelegationTokenIdentifier identifier) throws InvalidToken {
  DelegationTokenInformation info = this.tokenStore.getToken(identifier);
  if (info == null) {
    throw new InvalidToken("token expired or does not exist: " + identifier);
  }
  // must reuse super as info.getPassword is not accessible
  synchronized (this) {
    try {
      super.currentTokens.put(identifier, info);
      return super.retrievePassword(identifier);
    } finally {
      super.currentTokens.remove(identifier);
    }
  }
} {code}
So I propose adding some logging about the lifecycle of the tokenPath.
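
The kind of lifecycle logging requested here could look roughly like the sketch below. It is not Hive's ZooKeeperTokenStore: the class TokenPathLoggingSketch, its methods, the event names, and the use of URL-safe Base64 as a stand-in for encodeWritable are all assumptions made for illustration, and an in-memory map replaces ZooKeeper so the example runs on its own. The point is only that every create, failed lookup, and delete records the encoded path together with the identifier.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Base64;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Hypothetical in-memory stand-in for a ZooKeeper-backed token store.
 * It is NOT Hive's ZooKeeperTokenStore; it only illustrates logging the
 * encoded path at each lifecycle step.
 */
final class TokenPathLoggingSketch {
    private final String root;                                 // assumed parent znode
    private final Map<String, byte[]> nodes = new HashMap<>(); // path -> token data
    private final List<String> log = new ArrayList<>();        // stands in for a real logger

    TokenPathLoggingSketch(String root) {
        this.root = root;
    }

    /** Assumed stand-in for encodeWritable: URL-safe Base64 of the identifier text. */
    static String encode(String identifier) {
        return Base64.getUrlEncoder().withoutPadding()
                .encodeToString(identifier.getBytes(StandardCharsets.UTF_8));
    }

    String pathFor(String identifier) {
        return root + "/" + encode(identifier);
    }

    synchronized boolean addToken(String identifier, byte[] info) {
        String path = pathFor(identifier);
        boolean created = nodes.putIfAbsent(path, info) == null;
        record(created ? "CREATED" : "ALREADY_EXISTS", identifier, path);
        return created;
    }

    synchronized byte[] getToken(String identifier) {
        String path = pathFor(identifier);
        byte[] info = nodes.get(path);
        if (info == null) {
            // A failed lookup now names the path, not only the identifier.
            record("NOT_FOUND", identifier, path);
        }
        return info;
    }

    synchronized boolean removeToken(String identifier) {
        String path = pathFor(identifier);
        boolean removed = nodes.remove(path) != null;
        record(removed ? "REMOVED" : "REMOVE_MISSED", identifier, path);
        return removed;
    }

    /** Returns a copy of the recorded events, oldest first. */
    synchronized List<String> events() {
        return new ArrayList<>(log);
    }

    private void record(String event, String identifier, String path) {
        log.add(event + " tokenPath=" + path + " identifier=" + identifier
                + " at=" + System.currentTimeMillis());
    }
}
```

With events like these, a "token expired or does not exist" failure can be matched against the earlier CREATED and REMOVED entries for the same tokenPath.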





[jira] [Assigned] (HIVE-27309) Large number of partitions and small files causes OOM in query coordinator

2023-08-03 Thread Dmitriy Fingerman (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitriy Fingerman reassigned HIVE-27309:


Assignee: Dmitriy Fingerman

> Large number of partitions and small files causes OOM in query coordinator
> --
>
> Key: HIVE-27309
> URL: https://issues.apache.org/jira/browse/HIVE-27309
> Project: Hive
>  Issue Type: Improvement
>  Components: Iceberg integration
>Reporter: Rajesh Balamohan
>Assignee: Dmitriy Fingerman
>Priority: Major
>
> When a large number of nested partitions (with small files) is read, the AM 
> bails out with OOM.
> {noformat}
> CREATE EXTERNAL TABLE `store_sales_delete_6`(
>   `ss_sold_time_sk` int,
>   `ss_item_sk` int,
>   `ss_customer_sk` int,
>   `ss_cdemo_sk` int,
>   `ss_hdemo_sk` int,
>   `ss_addr_sk` int,
>   `ss_store_sk` int,
>   `ss_promo_sk` int,
>   `ss_ticket_number` bigint,
>   `ss_quantity` int,
>   `ss_wholesale_cost` decimal(7,2),
>   `ss_list_price` decimal(7,2),
>   `ss_sales_price` decimal(7,2),
>   `ss_ext_discount_amt` decimal(7,2),
>   `ss_ext_sales_price` decimal(7,2),
>   `ss_ext_wholesale_cost` decimal(7,2),
>   `ss_ext_list_price` decimal(7,2),
>   `ss_ext_tax` decimal(7,2),
>   `ss_coupon_amt` decimal(7,2),
>   `ss_net_paid` decimal(7,2),
>   `ss_net_paid_inc_tax` decimal(7,2),
>   `ss_net_profit` decimal(7,2),
>   `ss_sold_date_sk` int)
> PARTITIONED BY SPEC (
> ss_store_sk, ss_promo_sk, ss_sold_date_sk) STORED by iceberg LOCATION 
> 's3a://blah/blah/tablespace/external/hive/blah.db/store_sales_delete_6';
> alter table store_sales_delete_6 set tblproperties('format'='iceberg/parquet');
> alter table store_sales_delete_6 set tblproperties('format-version'='2');
> insert into store_sales_delete_6 select * from tpcds_1000_update.ssv limit 10;
> select count(*) from store_sales_delete_6;
> {noformat}
> Now the select count query throws OOM in the query AM. The query generates 
> 100,000 splits, which are grouped into 41 splits, but streaming these and 
> sending them as events throws OOM.





[jira] [Resolved] (HIVE-27495) NPE when trying to transform using clause to on clause

2023-08-03 Thread Soumyakanti Das (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Soumyakanti Das resolved HIVE-27495.

Resolution: Fixed

> NPE when trying to transform using clause to on clause
> --
>
> Key: HIVE-27495
> URL: https://issues.apache.org/jira/browse/HIVE-27495
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Soumyakanti Das
>Assignee: Soumyakanti Das
>Priority: Major
>  Labels: pull-request-available
>
> To reproduce, run the following:
> {code:java}
> create table test (
>  id int
> );
> select * from test t1
> join test t2 using(id)
> join test t3 using(id); {code}
> This will fail with:
> {code:java}
> 2023-07-10T14:56:59,715 ERROR [3fb8ea2a-392a-440e-87ec-414ddbbdf273 Listener at 0.0.0.0/65008] parse.CalcitePlanner: CBO failed, skipping CBO.
> java.lang.NullPointerException: null
>   at org.apache.hadoop.hive.ql.parse.type.TypeCheckProcFactory$StrExprProcessor.process(TypeCheckProcFactory.java:418) ~[hive-exec-4.0.0-beta-1-SNAPSHOT.jar:4.0.0-beta-1-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.lib.CostLessRuleDispatcher.dispatch(CostLessRuleDispatcher.java:66) ~[hive-exec-4.0.0-beta-1-SNAPSHOT.jar:4.0.0-beta-1-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105) ~[hive-exec-4.0.0-beta-1-SNAPSHOT.jar:4.0.0-beta-1-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89) ~[hive-exec-4.0.0-beta-1-SNAPSHOT.jar:4.0.0-beta-1-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.lib.ExpressionWalker.walk(ExpressionWalker.java:101) ~[hive-exec-4.0.0-beta-1-SNAPSHOT.jar:4.0.0-beta-1-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:120) ~[hive-exec-4.0.0-beta-1-SNAPSHOT.jar:4.0.0-beta-1-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.parse.type.TypeCheckProcFactory.genExprNode(TypeCheckProcFactory.java:228) ~[hive-exec-4.0.0-beta-1-SNAPSHOT.jar:4.0.0-beta-1-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.parse.type.RexNodeTypeCheck.genExprNodeJoinCond(RexNodeTypeCheck.java:60) ~[hive-exec-4.0.0-beta-1-SNAPSHOT.jar:4.0.0-beta-1-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genJoinRelNode(CalcitePlanner.java:2646) ~[hive-exec-4.0.0-beta-1-SNAPSHOT.jar:4.0.0-beta-1-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genJoinLogicalPlan(CalcitePlanner.java:2878) ~[hive-exec-4.0.0-beta-1-SNAPSHOT.jar:4.0.0-beta-1-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genLogicalPlan(CalcitePlanner.java:5038) ~[hive-exec-4.0.0-beta-1-SNAPSHOT.jar:4.0.0-beta-1-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1649) ~[hive-exec-4.0.0-beta-1-SNAPSHOT.jar:4.0.0-beta-1-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1593) ~[hive-exec-4.0.0-beta-1-SNAPSHOT.jar:4.0.0-beta-1-SNAPSHOT]
>   at org.apache.calcite.tools.Frameworks.lambda$withPlanner$0(Frameworks.java:131) ~[hive-exec-4.0.0-beta-1-SNAPSHOT.jar:4.0.0-beta-1-SNAPSHOT]
>   at org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:914) ~[hive-exec-4.0.0-beta-1-SNAPSHOT.jar:4.0.0-beta-1-SNAPSHOT]
>   at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:180) ~[hive-exec-4.0.0-beta-1-SNAPSHOT.jar:4.0.0-beta-1-SNAPSHOT]
>   at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:126) ~[hive-exec-4.0.0-beta-1-SNAPSHOT.jar:4.0.0-beta-1-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.parse.CalcitePlanner.logicalPlan(CalcitePlanner.java:1345) ~[hive-exec-4.0.0-beta-1-SNAPSHOT.jar:4.0.0-beta-1-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:572) ~[hive-exec-4.0.0-beta-1-SNAPSHOT.jar:4.0.0-beta-1-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12826) ~[hive-exec-4.0.0-beta-1-SNAPSHOT.jar:4.0.0-beta-1-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:467) ~[hive-exec-4.0.0-beta-1-SNAPSHOT.jar:4.0.0-beta-1-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:327)
>  

[jira] [Updated] (HIVE-27563) Add typeof UDF

2023-08-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-27563:
--
Labels: pull-request-available  (was: )

> Add typeof UDF
> --
>
> Key: HIVE-27563
> URL: https://issues.apache.org/jira/browse/HIVE-27563
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Reporter: John Sherman
>Assignee: John Sherman
>Priority: Minor
>  Labels: pull-request-available
>
> It would be useful to have a typeof UDF that returns a string describing the 
> type of the argument.
> For example:
> SELECT typeof(int_column) FROM made_up_table
> would return:
> int
> This is useful for test purposes and debugging purposes.





[jira] [Created] (HIVE-27563) Add typeof UDF

2023-08-03 Thread John Sherman (Jira)
John Sherman created HIVE-27563:
---

 Summary: Add typeof UDF
 Key: HIVE-27563
 URL: https://issues.apache.org/jira/browse/HIVE-27563
 Project: Hive
  Issue Type: Improvement
  Components: UDF
Reporter: John Sherman
Assignee: John Sherman


It would be useful to have a typeof UDF that returns a string describing the 
type of the argument.

For example:

SELECT typeof(int_column) FROM made_up_table
would return:
int

This is useful for test purposes and debugging purposes.
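
As a rough illustration of the requested behaviour, the sketch below models a typeof that reports the declared type of its argument rather than inspecting a row value. It does not use Hive's UDF classes: ArgumentType, Primitive, and Decimal are hypothetical stand-ins for the type metadata a query engine attaches to an expression, chosen so the example compiles on its own.

```java
/** Illustrative only; none of these types are Hive classes. */
final class TypeOfSketch {

    /** Hypothetical stand-in for the type metadata attached to an argument. */
    interface ArgumentType {
        String getTypeName();
    }

    /** A few unparameterized types, named as Hive prints them. */
    enum Primitive implements ArgumentType {
        INT("int"), BIGINT("bigint"), STRING("string");

        private final String typeName;

        Primitive(String typeName) {
            this.typeName = typeName;
        }

        @Override
        public String getTypeName() {
            return typeName;
        }
    }

    /** A parameterized type, for example decimal(7,2). */
    static final class Decimal implements ArgumentType {
        private final int precision;
        private final int scale;

        Decimal(int precision, int scale) {
            this.precision = precision;
            this.scale = scale;
        }

        @Override
        public String getTypeName() {
            return "decimal(" + precision + "," + scale + ")";
        }
    }

    /** The result depends only on the declared type, never on the row value. */
    static String typeOf(ArgumentType declaredType) {
        return declaredType.getTypeName();
    }

    private TypeOfSketch() {
    }
}
```

Under these assumptions, `TypeOfSketch.typeOf(TypeOfSketch.Primitive.INT)` yields `int`, matching the example in the description, and a column declared as decimal(7,2) would be described with its precision and scale.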





[jira] [Updated] (HIVE-27562) Iceberg: Fetching virtual columns failing

2023-08-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-27562:
--
Labels: pull-request-available  (was: )

> Iceberg: Fetching virtual columns failing
> -
>
> Key: HIVE-27562
> URL: https://issues.apache.org/jira/browse/HIVE-27562
> Project: Hive
>  Issue Type: Bug
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
>  Labels: pull-request-available
>
> Fetching virtual column fails with
> {noformat}
> Error: Error while compiling statement: FAILED: Execution Error, return code 
> 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, 
> vertexName=Map 1, vertexId=vertex_1691064079730_0001_3_00, diagnostics=[Task 
> failed, taskId=task_1691064079730_0001_3_00_00, diagnostics=[TaskAttempt 
> 0 failed, info=[Error: Error while running task ( failure ) : 
> attempt_1691064079730_0001_3_00_00_0:java.lang.RuntimeException: 
> java.lang.RuntimeException: java.io.IOException: 
> java.lang.IndexOutOfBoundsException: start index (4) must not be greater than 
> size (1)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:348)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:276)
>   at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:381)
> {noformat}
> Or:
> {noformat}
> Caused by: java.lang.IllegalStateException: Not an instance of 
> org.apache.iceberg.util.StructProjection: 2
>   at org.apache.iceberg.data.GenericRecord.get(GenericRecord.java:123)
>   at 
> org.apache.iceberg.mr.hive.IcebergAcidUtil.computePartitionHash(IcebergAcidUtil.java:192)
>   at 
> org.apache.iceberg.mr.hive.IcebergAcidUtil$VirtualColumnAwareIterator.next(IcebergAcidUtil.java:255)
>   at 
> org.apache.iceberg.mr.mapreduce.IcebergInputFormat$IcebergRecordReader.nextKeyValue(IcebergInputFormat.java:279)
> {noformat}





[jira] [Updated] (HIVE-27562) Iceberg: Fetching virtual columns failing

2023-08-03 Thread Ayush Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena updated HIVE-27562:

Description: 
Fetching virtual column fails with
{noformat}
Error: Error while compiling statement: FAILED: Execution Error, return code 2 
from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Map 
1, vertexId=vertex_1691064079730_0001_3_00, diagnostics=[Task failed, 
taskId=task_1691064079730_0001_3_00_00, diagnostics=[TaskAttempt 0 failed, 
info=[Error: Error while running task ( failure ) : 
attempt_1691064079730_0001_3_00_00_0:java.lang.RuntimeException: 
java.lang.RuntimeException: java.io.IOException: 
java.lang.IndexOutOfBoundsException: start index (4) must not be greater than 
size (1)
at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:348)
at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:276)
at 
org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:381)
{noformat}

Or:

{noformat}
Caused by: java.lang.IllegalStateException: Not an instance of 
org.apache.iceberg.util.StructProjection: 2
at org.apache.iceberg.data.GenericRecord.get(GenericRecord.java:123)
at 
org.apache.iceberg.mr.hive.IcebergAcidUtil.computePartitionHash(IcebergAcidUtil.java:192)
at 
org.apache.iceberg.mr.hive.IcebergAcidUtil$VirtualColumnAwareIterator.next(IcebergAcidUtil.java:255)
at 
org.apache.iceberg.mr.mapreduce.IcebergInputFormat$IcebergRecordReader.nextKeyValue(IcebergInputFormat.java:279)
{noformat}


  was:
Fetching virtual column fails with
{noformat}
Error: Error while compiling statement: FAILED: Execution Error, return code 2 
from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Map 
1, vertexId=vertex_1691064079730_0001_3_00, diagnostics=[Task failed, 
taskId=task_1691064079730_0001_3_00_00, diagnostics=[TaskAttempt 0 failed, 
info=[Error: Error while running task ( failure ) : 
attempt_1691064079730_0001_3_00_00_0:java.lang.RuntimeException: 
java.lang.RuntimeException: java.io.IOException: 
java.lang.IndexOutOfBoundsException: start index (4) must not be greater than 
size (1)
at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:348)
at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:276)
at 
org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:381)
{noformat}


> Iceberg: Fetching virtual columns failing
> -
>
> Key: HIVE-27562
> URL: https://issues.apache.org/jira/browse/HIVE-27562
> Project: Hive
>  Issue Type: Bug
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
>
> Fetching virtual column fails with
> {noformat}
> Error: Error while compiling statement: FAILED: Execution Error, return code 
> 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, 
> vertexName=Map 1, vertexId=vertex_1691064079730_0001_3_00, diagnostics=[Task 
> failed, taskId=task_1691064079730_0001_3_00_00, diagnostics=[TaskAttempt 
> 0 failed, info=[Error: Error while running task ( failure ) : 
> attempt_1691064079730_0001_3_00_00_0:java.lang.RuntimeException: 
> java.lang.RuntimeException: java.io.IOException: 
> java.lang.IndexOutOfBoundsException: start index (4) must not be greater than 
> size (1)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:348)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:276)
>   at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:381)
> {noformat}
> Or:
> {noformat}
> Caused by: java.lang.IllegalStateException: Not an instance of 
> org.apache.iceberg.util.StructProjection: 2
>   at org.apache.iceberg.data.GenericRecord.get(GenericRecord.java:123)
>   at 
> org.apache.iceberg.mr.hive.IcebergAcidUtil.computePartitionHash(IcebergAcidUtil.java:192)
>   at 
> org.apache.iceberg.mr.hive.IcebergAcidUtil$VirtualColumnAwareIterator.next(IcebergAcidUtil.java:255)
>   at 
> org.apache.iceberg.mr.mapreduce.IcebergInputFormat$IcebergRecordReader.nextKeyValue(IcebergInputFormat.java:279)
> {noformat}





[jira] [Created] (HIVE-27562) Iceberg: Fetching virtual columns failing

2023-08-03 Thread Ayush Saxena (Jira)
Ayush Saxena created HIVE-27562:
---

 Summary: Iceberg: Fetching virtual columns failing
 Key: HIVE-27562
 URL: https://issues.apache.org/jira/browse/HIVE-27562
 Project: Hive
  Issue Type: Bug
Reporter: Ayush Saxena
Assignee: Ayush Saxena


Fetching virtual column fails with
{noformat}
Error: Error while compiling statement: FAILED: Execution Error, return code 2 
from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Map 
1, vertexId=vertex_1691064079730_0001_3_00, diagnostics=[Task failed, 
taskId=task_1691064079730_0001_3_00_00, diagnostics=[TaskAttempt 0 failed, 
info=[Error: Error while running task ( failure ) : 
attempt_1691064079730_0001_3_00_00_0:java.lang.RuntimeException: 
java.lang.RuntimeException: java.io.IOException: 
java.lang.IndexOutOfBoundsException: start index (4) must not be greater than 
size (1)
at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:348)
at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:276)
at 
org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:381)
{noformat}





[jira] [Resolved] (HIVE-27540) Fix orc_merge10.q test in branch-3

2023-08-03 Thread Sankar Hariappan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan resolved HIVE-27540.
-
Resolution: Fixed

> Fix orc_merge10.q test in branch-3
> --
>
> Key: HIVE-27540
> URL: https://issues.apache.org/jira/browse/HIVE-27540
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Aman Raj
>Assignee: Aman Raj
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.2.0
>
>






[jira] [Updated] (HIVE-27540) Fix orc_merge10.q test in branch-3

2023-08-03 Thread Sankar Hariappan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-27540:

Summary: Fix orc_merge10.q test in branch-3  (was: Fix orc_merge10.q test)

> Fix orc_merge10.q test in branch-3
> --
>
> Key: HIVE-27540
> URL: https://issues.apache.org/jira/browse/HIVE-27540
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Aman Raj
>Assignee: Aman Raj
>Priority: Major
>  Labels: pull-request-available
>






[jira] [Updated] (HIVE-27540) Fix orc_merge10.q test in branch-3

2023-08-03 Thread Sankar Hariappan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-27540:

Fix Version/s: 3.2.0

> Fix orc_merge10.q test in branch-3
> --
>
> Key: HIVE-27540
> URL: https://issues.apache.org/jira/browse/HIVE-27540
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Aman Raj
>Assignee: Aman Raj
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.2.0
>
>






[jira] [Updated] (HIVE-27283) Use tez.local.mode in HiveServer2 for trivial queries

2023-08-03 Thread László Bodor (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor updated HIVE-27283:

Description: 
Today, a query like this:
{code}
INSERT INTO TABLE students VALUES ('fred flintstone', 35, 1.28), ('barney rubble', 32, 2.32);
{code}
spins up a TezAM and containers. I believe this is not optimal, even if we 
already have a Tez application running. Not to mention setups where only a 
HiveServer2 is alive and TezAMs + LLAP executors are spun up on demand, e.g. 
Cloudera's Data Warehouse, but I'm assuming other companies might do a similar 
thing in the cloud.

A possible risk of this optimization is overwhelming HiveServer2 with such 
queries; this scenario should be handled with care.

My proposal is to maintain a local Tez session pool (default size 0, 
recommended size 1 to 4) in HS2, and to identify at compile time the "trivial 
queries" that currently need a Tez application (like the INSERT INTO above).
The first implementation can include only simple INSERT INTO queries, and we 
can decide the rest later.

  was:
Today, a query like this:
{code}
INSERT INTO TABLE students VALUES ('fred flintstone', 35, 1.28), ('barney rubble', 32, 2.32);
{code}
spins up a TezAM and containers. I believe this is not optimal, even if we 
already have a Tez application running. Not to mention setups where only a 
HiveServer2 is alive and TezAMs + LLAP executors are spun up on demand.

A possible risk of this optimization is overwhelming HiveServer2 with such 
queries; this scenario should be handled with care.

My proposal is to maintain a local Tez session pool (default size 0, 
recommended size 1 to 4) in HS2, and to identify at compile time the "trivial 
queries" that currently need a Tez application (like the INSERT INTO above).
The first implementation can include only simple INSERT INTO queries, and we 
can decide the rest later.


> Use tez.local.mode in HiveServer2 for trivial queries
> -
>
> Key: HIVE-27283
> URL: https://issues.apache.org/jira/browse/HIVE-27283
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Bodor
>Priority: Major
>
> Today, a query like this:
> {code}
> INSERT INTO TABLE students VALUES ('fred flintstone', 35, 1.28), ('barney rubble', 32, 2.32);
> {code}
> spins up a TezAM and containers. I believe this is not optimal, even if we 
> already have a Tez application running. Not to mention setups where only a 
> HiveServer2 is alive and TezAMs + LLAP executors are spun up on demand, e.g. 
> Cloudera's Data Warehouse, but I'm assuming other companies might do a 
> similar thing in the cloud.
> A possible risk of this optimization is overwhelming HiveServer2 with such 
> queries; this scenario should be handled with care.
> My proposal is to maintain a local Tez session pool (default size 0, 
> recommended size 1 to 4) in HS2, and to identify at compile time the "trivial 
> queries" that currently need a Tez application (like the INSERT INTO 
> above).
> The first implementation can include only simple INSERT INTO queries, and we 
> can decide the rest later.
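
The routing idea in this proposal can be sketched as follows. This is a toy model, not HiveServer2 code: LocalSessionRouterSketch, its regex heuristic for a "trivial" query, and the semaphore used as the bounded local session pool are all assumptions for illustration. A real implementation would presumably decide from the compiled plan rather than the SQL text.

```java
import java.util.concurrent.Semaphore;
import java.util.regex.Pattern;

/** Illustrative router: trivial queries may use a bounded pool of local sessions. */
final class LocalSessionRouterSketch {
    // Heuristic only: treats "INSERT INTO ... VALUES" statements as trivial.
    private static final Pattern TRIVIAL_INSERT = Pattern.compile(
            "^\\s*INSERT\\s+INTO\\b.*\\bVALUES\\b",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    private final Semaphore localSessions;

    /** A pool size of 0 disables local execution, mirroring the proposed default. */
    LocalSessionRouterSketch(int poolSize) {
        this.localSessions = new Semaphore(poolSize);
    }

    static boolean isTrivial(String sql) {
        return TRIVIAL_INSERT.matcher(sql).find();
    }

    /**
     * Returns true if the query should run in a local session. Returns false
     * when the query is not trivial or every local session is busy, in which
     * case the caller falls back to the regular Tez application.
     */
    boolean tryAcquireLocal(String sql) {
        return isTrivial(sql) && localSessions.tryAcquire();
    }

    /** Must be called once after each successful tryAcquireLocal. */
    void releaseLocal() {
        localSessions.release();
    }
}
```

Because tryAcquire never blocks, a burst of trivial queries cannot queue up inside HiveServer2; anything beyond the pool size simply takes the existing path, which is one way to address the overload risk mentioned above.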





[jira] [Updated] (HIVE-27561) Backport HIVE-22094 : queries failing with ClassCastException: hive.ql.exec.vector.DecimalColumnVector cannot be cast to hive.ql.exec.vector.Decimal64ColumnVector

2023-08-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-27561:
--
Labels: pull-request-available  (was: )

> Backport HIVE-22094 : queries failing with ClassCastException: 
> hive.ql.exec.vector.DecimalColumnVector cannot be cast to 
> hive.ql.exec.vector.Decimal64ColumnVector
> --
>
> Key: HIVE-27561
> URL: https://issues.apache.org/jira/browse/HIVE-27561
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Diksha
>Assignee: Diksha
>Priority: Major
>  Labels: pull-request-available
>
> backport HIVE-22094 : queries failing with ClassCastException: 
> hive.ql.exec.vector.DecimalColumnVector cannot be cast to 
> hive.ql.exec.vector.Decimal64ColumnVector





[jira] [Created] (HIVE-27561) Backport HIVE-22094 : queries failing with ClassCastException: hive.ql.exec.vector.DecimalColumnVector cannot be cast to hive.ql.exec.vector.Decimal64ColumnVector

2023-08-03 Thread Diksha (Jira)
Diksha created HIVE-27561:
-

 Summary: Backport HIVE-22094 : queries failing with 
ClassCastException: hive.ql.exec.vector.DecimalColumnVector cannot be cast to 
hive.ql.exec.vector.Decimal64ColumnVector
 Key: HIVE-27561
 URL: https://issues.apache.org/jira/browse/HIVE-27561
 Project: Hive
  Issue Type: Sub-task
Reporter: Diksha
Assignee: Diksha


backport HIVE-22094 : queries failing with ClassCastException: 
hive.ql.exec.vector.DecimalColumnVector cannot be cast to 
hive.ql.exec.vector.Decimal64ColumnVector





[jira] [Updated] (HIVE-27560) Enhancing compatibility with Guava

2023-08-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-27560:
--
Labels: pull-request-available  (was: )

> Enhancing compatibility with Guava
> --
>
> Key: HIVE-27560
> URL: https://issues.apache.org/jira/browse/HIVE-27560
> Project: Hive
>  Issue Type: Improvement
>  Components: Build Infrastructure
>Affects Versions: 2.3.10
>Reporter: Yang Jie
>Priority: Major
>  Labels: pull-request-available
>



