[jira] [Commented] (HIVE-27650) Oracle init-db is flaky

2023-08-24 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-27650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17758748#comment-17758748
 ] 

Ayush Saxena commented on HIVE-27650:
-

Similar to commit 34b24d55ade393673424f077b69add43bad9f731, which disabled 
MySQL, I plan to disable the Oracle docker for now until we figure out the 
reason for the failure. It fails pretty often and leads to retriggers and 
ultimately resource wastage.

> Oracle init-db is flaky
> ---
>
> Key: HIVE-27650
> URL: https://issues.apache.org/jira/browse/HIVE-27650
> Project: Hive
>  Issue Type: Bug
>Reporter: Ayush Saxena
>Priority: Major
>
> The Oracle docker in the Hive precommit fails very often, e.g. 
> http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/PR-4578/2/pipeline/462



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HIVE-27650) Oracle init-db is flaky

2023-08-24 Thread Ayush Saxena (Jira)
Ayush Saxena created HIVE-27650:
---

 Summary: Oracle init-db is flaky
 Key: HIVE-27650
 URL: https://issues.apache.org/jira/browse/HIVE-27650
 Project: Hive
  Issue Type: Bug
Reporter: Ayush Saxena


The Oracle docker in the Hive precommit fails very often, e.g. 
http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/PR-4578/2/pipeline/462



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-27646) Iceberg: Retry query when concurrent write queries fail due to conflicting writes

2023-08-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-27646:
--
Labels: pull-request-available  (was: )

> Iceberg: Retry query when concurrent write queries fail due to conflicting 
> writes
> -
>
> Key: HIVE-27646
> URL: https://issues.apache.org/jira/browse/HIVE-27646
> Project: Hive
>  Issue Type: Improvement
>Reporter: Simhadri Govindappa
>Assignee: Simhadri Govindappa
>Priority: Major
>  Labels: pull-request-available
>
> Assume two concurrent update queries, Query A and Query B, that have 
> overlapping updates.
> If Query A commits its data and delete files first, then Query B will fail 
> with a validation failure due to conflicting writes.
> In this case, Query B should invalidate the commit files that were already 
> generated and re-execute the full query on the latest snapshot.
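
As an illustration of the retry behaviour described above, here is a hedged 
sketch (not the actual patch; every name below is a hypothetical stand-in, not 
a real Hive or Iceberg API):
{code:java}
// Hedged sketch only: retry a write query after a conflicting-commit failure.
public class ConflictRetrySketch {

  /** Hypothetical stand-in for the "conflicting writes" validation failure. */
  static class ConflictingWriteException extends RuntimeException { }

  public static void runWithRetry(Runnable query, int maxRetries) {
    for (int attempt = 0; ; attempt++) {
      try {
        query.run();               // plan the query and commit data/delete files
        return;                    // commit succeeded
      } catch (ConflictingWriteException e) {
        if (attempt >= maxRetries) {
          throw e;                 // give up after the configured retries
        }
        // Per the description: drop the commit files already generated and
        // re-execute the full query against the latest table snapshot.
      }
    }
  }
}
{code}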



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-27532) Missing semicolon in show create table and show create database output

2023-08-24 Thread Soumyakanti Das (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Soumyakanti Das updated HIVE-27532:
---
Description: 
When we run SHOW CREATE TABLE on a table with constraints, it doesn't add a 
semicolon to the CREATE TABLE DDL. Here's the output for the TPC-DS table reason:
{code:java}
 CREATE TABLE `reason`(
   `r_reason_sk` int,
   `r_reason_id` string,
   `r_reason_desc` string)
 ROW FORMAT SERDE
   'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
 WITH SERDEPROPERTIES (
   'field.delim'='|',
   'serialization.format'='|')
 STORED AS ORC
 TBLPROPERTIES (
   'transactional'='true',
   'OBJCAPABILITIES'='EXTREAD,EXTWRITE',
   'STATS_GENERATED'='TASK',
   'impala.lastComputeStatsTime'='1674074181',
   'serialization.null.format'='',
   'transient_lastDdlTime'='1674073496')
 ALTER TABLE reason ADD CONSTRAINT 2e47abb2-b6c7-450a-8229-395d6b1ff168 PRIMARY 
KEY (r_reason_sk) DISABLE NOVALIDATE RELY; {code}
 

For completeness, we should also add a semicolon to the SHOW CREATE DATABASE 
output.

  was:
When we run SHOW CREATE TABLE on a table with constraints, it doesn't add a 
semi-colon to the create table ddl. Here's an output for tpcds table reason:


{code:java}
 CREATE TABLE `reason`(
   `r_reason_sk` int,
   `r_reason_id` string,
   `r_reason_desc` string)
 ROW FORMAT SERDE
   'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
 WITH SERDEPROPERTIES (
   'field.delim'='|',
   'serialization.format'='|')
 STORED AS ORC
 TBLPROPERTIES (
   'transactional'='true',
   'OBJCAPABILITIES'='EXTREAD,EXTWRITE',
   'STATS_GENERATED'='TASK',
   'impala.lastComputeStatsTime'='1674074181',
   'serialization.null.format'='',
   'transient_lastDdlTime'='1674073496')
 ALTER TABLE reason ADD CONSTRAINT 2e47abb2-b6c7-450a-8229-395d6b1ff168 PRIMARY 
KEY (r_reason_sk) DISABLE NOVALIDATE RELY; {code}


> Missing semicolon in show create table and show create database output
> --
>
> Key: HIVE-27532
> URL: https://issues.apache.org/jira/browse/HIVE-27532
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Soumyakanti Das
>Assignee: Soumyakanti Das
>Priority: Major
>  Labels: pull-request-available
>
> When we run SHOW CREATE TABLE on a table with constraints, it doesn't add a 
> semicolon to the CREATE TABLE DDL. Here's the output for the TPC-DS table reason:
> {code:java}
>  CREATE TABLE `reason`(
>    `r_reason_sk` int,
>    `r_reason_id` string,
>    `r_reason_desc` string)
>  ROW FORMAT SERDE
>    'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
>  WITH SERDEPROPERTIES (
>    'field.delim'='|',
>    'serialization.format'='|')
>  STORED AS ORC
>  TBLPROPERTIES (
>    'transactional'='true',
>    'OBJCAPABILITIES'='EXTREAD,EXTWRITE',
>    'STATS_GENERATED'='TASK',
>    'impala.lastComputeStatsTime'='1674074181',
>    'serialization.null.format'='',
>    'transient_lastDdlTime'='1674073496')
>  ALTER TABLE reason ADD CONSTRAINT 2e47abb2-b6c7-450a-8229-395d6b1ff168 
> PRIMARY KEY (r_reason_sk) DISABLE NOVALIDATE RELY; {code}
>  
> For completeness, we should also add a semicolon to the SHOW CREATE DATABASE 
> output.
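
For illustration only (abridged, and not the actual patch output), the 
expectation is that each generated statement is terminated so the output can be 
replayed directly, e.g. a semicolon after the TBLPROPERTIES list and after the 
constraint statement:
{code:java}
 CREATE TABLE `reason`(
   `r_reason_sk` int,
   `r_reason_id` string,
   `r_reason_desc` string)
 STORED AS ORC
 TBLPROPERTIES (
   'transactional'='true',
   'transient_lastDdlTime'='1674073496');  -- terminator currently missing here
 ALTER TABLE reason ADD CONSTRAINT 2e47abb2-b6c7-450a-8229-395d6b1ff168 PRIMARY 
 KEY (r_reason_sk) DISABLE NOVALIDATE RELY;
{code}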



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-27532) Missing semicolon in show create table/database output

2023-08-24 Thread Soumyakanti Das (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Soumyakanti Das updated HIVE-27532:
---
Summary: Missing semicolon in show create table/database output  (was: 
Missing semicolon in show create table output)

> Missing semicolon in show create table/database output
> --
>
> Key: HIVE-27532
> URL: https://issues.apache.org/jira/browse/HIVE-27532
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Soumyakanti Das
>Assignee: Soumyakanti Das
>Priority: Major
>  Labels: pull-request-available
>
> When we run SHOW CREATE TABLE on a table with constraints, it doesn't add a 
> semicolon to the CREATE TABLE DDL. Here's the output for the TPC-DS table reason:
> {code:java}
>  CREATE TABLE `reason`(
>    `r_reason_sk` int,
>    `r_reason_id` string,
>    `r_reason_desc` string)
>  ROW FORMAT SERDE
>    'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
>  WITH SERDEPROPERTIES (
>    'field.delim'='|',
>    'serialization.format'='|')
>  STORED AS ORC
>  TBLPROPERTIES (
>    'transactional'='true',
>    'OBJCAPABILITIES'='EXTREAD,EXTWRITE',
>    'STATS_GENERATED'='TASK',
>    'impala.lastComputeStatsTime'='1674074181',
>    'serialization.null.format'='',
>    'transient_lastDdlTime'='1674073496')
>  ALTER TABLE reason ADD CONSTRAINT 2e47abb2-b6c7-450a-8229-395d6b1ff168 
> PRIMARY KEY (r_reason_sk) DISABLE NOVALIDATE RELY; {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-27532) Missing semicolon in show create table and show create database output

2023-08-24 Thread Soumyakanti Das (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Soumyakanti Das updated HIVE-27532:
---
Summary: Missing semicolon in show create table and show create database 
output  (was: Missing semicolon in show create table/database output)

> Missing semicolon in show create table and show create database output
> --
>
> Key: HIVE-27532
> URL: https://issues.apache.org/jira/browse/HIVE-27532
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Soumyakanti Das
>Assignee: Soumyakanti Das
>Priority: Major
>  Labels: pull-request-available
>
> When we run SHOW CREATE TABLE on a table with constraints, it doesn't add a 
> semicolon to the CREATE TABLE DDL. Here's the output for the TPC-DS table reason:
> {code:java}
>  CREATE TABLE `reason`(
>    `r_reason_sk` int,
>    `r_reason_id` string,
>    `r_reason_desc` string)
>  ROW FORMAT SERDE
>    'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
>  WITH SERDEPROPERTIES (
>    'field.delim'='|',
>    'serialization.format'='|')
>  STORED AS ORC
>  TBLPROPERTIES (
>    'transactional'='true',
>    'OBJCAPABILITIES'='EXTREAD,EXTWRITE',
>    'STATS_GENERATED'='TASK',
>    'impala.lastComputeStatsTime'='1674074181',
>    'serialization.null.format'='',
>    'transient_lastDdlTime'='1674073496')
>  ALTER TABLE reason ADD CONSTRAINT 2e47abb2-b6c7-450a-8229-395d6b1ff168 
> PRIMARY KEY (r_reason_sk) DISABLE NOVALIDATE RELY; {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (HIVE-27649) Subqueries with a set operator do not support order by clauses

2023-08-24 Thread Nicolas Richard (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-27649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17758701#comment-17758701
 ] 

Nicolas Richard edited comment on HIVE-27649 at 8/24/23 7:08 PM:
-

I did some investigation to figure out what was going on. In 
https://issues.apache.org/jira/browse/HIVE-21980, the grammar [changed a 
bit|https://github.com/apache/hive/commit/0f39030c3d33b11ae9c14ac81e047b44e8695371]
 from:
{code:java}
atomjoinSource
@init { gParent.pushMsg("joinSource", state); }
@after { gParent.popMsg(state); }
    :  tableSource (lateralView^)*
    |  virtualTableSource (lateralView^)*
    |  (subQuerySource) => subQuerySource (lateralView^)*
    |  partitionedTableFunction (lateralView^)*
    |  LPAREN! joinSource RPAREN!
;{code}
to
{code:java}
atomjoinSource
@init { gParent.pushMsg("joinSource", state); }
@after { gParent.popMsg(state); }
    :  tableSource (lateralView^)*
    |  virtualTableSource (lateralView^)*
    |  (LPAREN (KW_WITH|KW_SELECT|KW_MAP|KW_REDUCE|KW_FROM)) => subQuerySource 
(lateralView^)*
    |  (LPAREN LPAREN atomSelectStatement RPAREN setOperator ) => 
subQuerySource (lateralView^)*
    |  partitionedTableFunction (lateralView^)*
    |  LPAREN! joinSource RPAREN! 
;{code}
When the query is parsed, we end up in the subQuerySource rule because 
atomSelectStatement, by definition, cannot contain SORT BY, CLUSTER BY, 
DISTRIBUTE BY or LIMIT clauses. An exception is thrown because subQuerySource 
requires an identifier, which is neither present nor needed in this particular 
scenario.

 

I tested it locally and changing _atomSelectStatement_ to _selectStatement_ 
solves the issue. However, I still need to validate that it does not have 
side-effects by running the whole test suite.
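
For reference, the change under test would presumably just swap the rule named 
in the syntactic predicate (a sketch only, still pending validation against the 
full test suite):
{code:java}
// Sketch: reference selectStatement instead of atomSelectStatement in the
// predicate, so a parenthesized operand of a set operator may still carry
// ORDER BY / SORT BY / CLUSTER BY / DISTRIBUTE BY / LIMIT clauses.
    |  (LPAREN LPAREN selectStatement RPAREN setOperator ) =>
       subQuerySource (lateralView^)*
{code}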


was (Author: JIRAUSER298135):
I did some investigation to figure out what was going on. In 
https://issues.apache.org/jira/browse/HIVE-21980, the grammar [changed a 
bit|https://github.com/apache/hive/commit/0f39030c3d33b11ae9c14ac81e047b44e8695371]
 from:
{code:java}
atomjoinSource
@init { gParent.pushMsg("joinSource", state); }
@after { gParent.popMsg(state); }
    :
    tableSource (lateralView^)*
    |
    virtualTableSource (lateralView^)*
    |
    (subQuerySource) => subQuerySource (lateralView^)*
    |
    partitionedTableFunction (lateralView^)*
    |
    LPAREN! joinSource RPAREN!
;{code}
to
{code:java}
atomjoinSource
@init { gParent.pushMsg("joinSource", state); }
@after { gParent.popMsg(state); }
    :  tableSource (lateralView^)*
    |  virtualTableSource (lateralView^)*
    |  (LPAREN (KW_WITH|KW_SELECT|KW_MAP|KW_REDUCE|KW_FROM)) => subQuerySource 
(lateralView^)*
    |  (LPAREN LPAREN atomSelectStatement RPAREN setOperator ) => 
subQuerySource (lateralView^)*
    |  partitionedTableFunction (lateralView^)*
    |  LPAREN! joinSource RPAREN! 
;{code}
When the query is parsed, we end up in the subQuerySource rule because 
atomSelectStatement, by definition, cannot contain SORT BY, CLUSTER BY, 
DISTRIBUTE BY or LIMIT clauses. An exception is thrown because subQuerySource 
requires an identifier which is not present nor needed in this particular 
scenario.

 

I tested it locally and changing _atomSelectStatement_ to _selectStatement_ 
solves the issue. However, I still need to validate that it does not have 
side-effects by running the whole test suite.

> Subqueries with a set operator do not support order by clauses
> --
>
> Key: HIVE-27649
> URL: https://issues.apache.org/jira/browse/HIVE-27649
> Project: Hive
>  Issue Type: Bug
>  Components: Parser
>Affects Versions: 3.1.2, 4.0.0
>Reporter: Nicolas Richard
>Priority: Major
>  Labels: pull-request-available
>
> Consider the following query:
> {code:java}
> select key from ((select key from src order by key) union (select key from 
> src))subq {code}
> Up until 3.1.2, Hive would parse this query without any problems. However, if 
> you try it on the latest versions, you'll get the following exception:
> {code:java}
> org.apache.hadoop.hive.ql.parse.ParseException: line 1:60 cannot recognize 
> input near 'union' '(' 'select' in subquery source
>         at 
> org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:125)
>         at 
> org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:97) {code}
> With the inner exception stack trace being:
> {code:java}
> NoViableAltException(367@[])
>     at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.identifier(HiveParser_IdentifiersParser.java:14006)
>     at 
> org.apache.hadoop.hive.ql.parse.HiveParser.identifier(HiveParser.java:45086)
>     at 
> org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.subQuerySource(HiveParser_FromClauseParser.java:5411)
>     at 
> 

[jira] [Comment Edited] (HIVE-27649) Subqueries with a set operator do not support order by clauses

2023-08-24 Thread Nicolas Richard (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-27649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17758701#comment-17758701
 ] 

Nicolas Richard edited comment on HIVE-27649 at 8/24/23 6:57 PM:
-

I did some investigation to figure out what was going on. In 
https://issues.apache.org/jira/browse/HIVE-21980, the grammar [changed a 
bit|https://github.com/apache/hive/commit/0f39030c3d33b11ae9c14ac81e047b44e8695371]
 from:
{code:java}
atomjoinSource
@init { gParent.pushMsg("joinSource", state); }
@after { gParent.popMsg(state); }
    :
    tableSource (lateralView^)*
    |
    virtualTableSource (lateralView^)*
    |
    (subQuerySource) => subQuerySource (lateralView^)*
    |
    partitionedTableFunction (lateralView^)*
    |
    LPAREN! joinSource RPAREN!
;{code}
to
{code:java}
atomjoinSource
@init { gParent.pushMsg("joinSource", state); }
@after { gParent.popMsg(state); }
    :  tableSource (lateralView^)*
    |  virtualTableSource (lateralView^)*
    |  (LPAREN (KW_WITH|KW_SELECT|KW_MAP|KW_REDUCE|KW_FROM)) => subQuerySource 
(lateralView^)*
    |  (LPAREN LPAREN atomSelectStatement RPAREN setOperator ) => 
subQuerySource (lateralView^)*
    |  partitionedTableFunction (lateralView^)*
    |  LPAREN! joinSource RPAREN! 
;{code}
When the query is parsed, we end up in the subQuerySource rule because 
atomSelectStatement, by definition, cannot contain SORT BY, CLUSTER BY, 
DISTRIBUTE BY or LIMIT clauses. An exception is thrown because subQuerySource 
requires an identifier which is not present nor needed in this particular 
scenario.

 

I tested it locally and changing _atomSelectStatement_ to _selectStatement_ 
solves the issue. However, I still need to validate that it does not have 
side-effects by running the whole test suite.


was (Author: JIRAUSER298135):
I did some investigation to figure out what was going on. In 
https://issues.apache.org/jira/browse/HIVE-21980, the grammar [changed a 
bit|https://github.com/apache/hive/commit/0f39030c3d33b11ae9c14ac81e047b44e8695371]
 from:
{code:java}
atomjoinSource
@init { gParent.pushMsg("joinSource", state); }
@after { gParent.popMsg(state); }
    :
    tableSource (lateralView^)*
    |
    virtualTableSource (lateralView^)*
    |
    (subQuerySource) => subQuerySource (lateralView^)*
    |
    partitionedTableFunction (lateralView^)*
    |
    LPAREN! joinSource RPAREN!
;{code}
to
{code:java}
atomjoinSource
@init { gParent.pushMsg("joinSource", state); }
@after { gParent.popMsg(state); }
    :  tableSource (lateralView^)*
    |  virtualTableSource (lateralView^)*
    |  (LPAREN (KW_WITH|KW_SELECT|KW_MAP|KW_REDUCE|KW_FROM)) => subQuerySource 
(lateralView^)*
    |  (LPAREN LPAREN atomSelectStatement RPAREN setOperator ) => 
subQuerySource (lateralView^)*
    |  partitionedTableFunction (lateralView^)*
    |  LPAREN! joinSource RPAREN! 
;{code}
When the query is parsed, we end up in the subQuerySource rule because 
atomSelectStatement, by definition, cannot contain SORT BY, CLUSTER BY, 
DISTRIBUTE BY or LIMIT clauses. An exception is thrown because subQuerySource 
requires an identifier which is not present nor needed in this particular 
scenario.

 

I tested it locally and changing _atomSelectStatement_ to _selectStatement_ 
solves the issue. However, I still need to validate that it does not have 
side-effects by running the whole test suite.

> Subqueries with a set operator do not support order by clauses
> --
>
> Key: HIVE-27649
> URL: https://issues.apache.org/jira/browse/HIVE-27649
> Project: Hive
>  Issue Type: Bug
>  Components: Parser
>Affects Versions: 3.1.2, 4.0.0
>Reporter: Nicolas Richard
>Priority: Major
>  Labels: pull-request-available
>
> Consider the following query:
> {code:java}
> select key from ((select key from src order by key) union (select key from 
> src))subq {code}
> Up until 3.1.2, Hive would parse this query without any problems. However, if 
> you try it on the latest versions, you'll get the following exception:
> {code:java}
> org.apache.hadoop.hive.ql.parse.ParseException: line 1:60 cannot recognize 
> input near 'union' '(' 'select' in subquery source
>         at 
> org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:125)
>         at 
> org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:97) {code}
> With the inner exception stack trace being:
> {code:java}
> NoViableAltException(367@[])
>     at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.identifier(HiveParser_IdentifiersParser.java:14006)
>     at 
> org.apache.hadoop.hive.ql.parse.HiveParser.identifier(HiveParser.java:45086)
>     at 
> 

[jira] [Commented] (HIVE-27649) Subqueries with a set operator do not support order by clauses

2023-08-24 Thread Nicolas Richard (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-27649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17758705#comment-17758705
 ] 

Nicolas Richard commented on HIVE-27649:


Proposed fix: [https://github.com/apache/hive/pull/4628] (still in Draft)

> Subqueries with a set operator do not support order by clauses
> --
>
> Key: HIVE-27649
> URL: https://issues.apache.org/jira/browse/HIVE-27649
> Project: Hive
>  Issue Type: Bug
>  Components: Parser
>Affects Versions: 3.1.2, 4.0.0
>Reporter: Nicolas Richard
>Priority: Major
>  Labels: pull-request-available
>
> Consider the following query:
> {code:java}
> select key from ((select key from src order by key) union (select key from 
> src))subq {code}
> Up until 3.1.2, Hive would parse this query without any problems. However, if 
> you try it on the latest versions, you'll get the following exception:
> {code:java}
> org.apache.hadoop.hive.ql.parse.ParseException: line 1:60 cannot recognize 
> input near 'union' '(' 'select' in subquery source
>         at 
> org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:125)
>         at 
> org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:97) {code}
> With the inner exception stack trace being:
> {code:java}
> NoViableAltException(367@[])
>     at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.identifier(HiveParser_IdentifiersParser.java:14006)
>     at 
> org.apache.hadoop.hive.ql.parse.HiveParser.identifier(HiveParser.java:45086)
>     at 
> org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.subQuerySource(HiveParser_FromClauseParser.java:5411)
>     at 
> org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.atomjoinSource(HiveParser_FromClauseParser.java:1921)
>     at 
> org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.joinSource(HiveParser_FromClauseParser.java:2175)
>     at 
> org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.atomjoinSource(HiveParser_FromClauseParser.java:2110)
>     at 
> org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.joinSource(HiveParser_FromClauseParser.java:2175)
>     at 
> org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.fromSource(HiveParser_FromClauseParser.java:1750)
>     at 
> org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.fromClause(HiveParser_FromClauseParser.java:1593)
>     at 
> org.apache.hadoop.hive.ql.parse.HiveParser.fromClause(HiveParser.java:45094)
>     at 
> org.apache.hadoop.hive.ql.parse.HiveParser.atomSelectStatement(HiveParser.java:38538)
>     at 
> org.apache.hadoop.hive.ql.parse.HiveParser.selectStatement(HiveParser.java:38831)
>     at 
> org.apache.hadoop.hive.ql.parse.HiveParser.regularBody(HiveParser.java:38424)
>     at 
> org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpressionBody(HiveParser.java:37686)
>     at 
> org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpression(HiveParser.java:37574)
>     at 
> org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:2757)
>     at 
> org.apache.hadoop.hive.ql.parse.HiveParser.explainStatement(HiveParser.java:1751)
>     at 
> org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1614)
>     at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:123)
>     at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:97) 
> {code}
> Note that this behavior also happens if the subquery contains a SORT BY, 
> CLUSTER BY, DISTRIBUTE BY or LIMIT clause.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-27649) Subqueries with a set operator do not support order by clauses

2023-08-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-27649:
--
Labels: pull-request-available  (was: )

> Subqueries with a set operator do not support order by clauses
> --
>
> Key: HIVE-27649
> URL: https://issues.apache.org/jira/browse/HIVE-27649
> Project: Hive
>  Issue Type: Bug
>  Components: Parser
>Affects Versions: 3.1.2, 4.0.0
>Reporter: Nicolas Richard
>Priority: Major
>  Labels: pull-request-available
>
> Consider the following query:
> {code:java}
> select key from ((select key from src order by key) union (select key from 
> src))subq {code}
> Up until 3.1.2, Hive would parse this query without any problems. However, if 
> you try it on the latest versions, you'll get the following exception:
> {code:java}
> org.apache.hadoop.hive.ql.parse.ParseException: line 1:60 cannot recognize 
> input near 'union' '(' 'select' in subquery source
>         at 
> org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:125)
>         at 
> org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:97) {code}
> With the inner exception stack trace being:
> {code:java}
> NoViableAltException(367@[])
>     at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.identifier(HiveParser_IdentifiersParser.java:14006)
>     at 
> org.apache.hadoop.hive.ql.parse.HiveParser.identifier(HiveParser.java:45086)
>     at 
> org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.subQuerySource(HiveParser_FromClauseParser.java:5411)
>     at 
> org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.atomjoinSource(HiveParser_FromClauseParser.java:1921)
>     at 
> org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.joinSource(HiveParser_FromClauseParser.java:2175)
>     at 
> org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.atomjoinSource(HiveParser_FromClauseParser.java:2110)
>     at 
> org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.joinSource(HiveParser_FromClauseParser.java:2175)
>     at 
> org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.fromSource(HiveParser_FromClauseParser.java:1750)
>     at 
> org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.fromClause(HiveParser_FromClauseParser.java:1593)
>     at 
> org.apache.hadoop.hive.ql.parse.HiveParser.fromClause(HiveParser.java:45094)
>     at 
> org.apache.hadoop.hive.ql.parse.HiveParser.atomSelectStatement(HiveParser.java:38538)
>     at 
> org.apache.hadoop.hive.ql.parse.HiveParser.selectStatement(HiveParser.java:38831)
>     at 
> org.apache.hadoop.hive.ql.parse.HiveParser.regularBody(HiveParser.java:38424)
>     at 
> org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpressionBody(HiveParser.java:37686)
>     at 
> org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpression(HiveParser.java:37574)
>     at 
> org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:2757)
>     at 
> org.apache.hadoop.hive.ql.parse.HiveParser.explainStatement(HiveParser.java:1751)
>     at 
> org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1614)
>     at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:123)
>     at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:97) 
> {code}
> Note that this behavior also happens if the subquery contains a SORT BY, 
> CLUSTER BY, DISTRIBUTE BY or LIMIT clause.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (HIVE-27649) Subqueries with a set operator do not support order by clauses

2023-08-24 Thread Nicolas Richard (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-27649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17758701#comment-17758701
 ] 

Nicolas Richard edited comment on HIVE-27649 at 8/24/23 6:55 PM:
-

I did some investigation to figure out what was going on. In 
https://issues.apache.org/jira/browse/HIVE-21980, the grammar [changed a 
bit|https://github.com/apache/hive/commit/0f39030c3d33b11ae9c14ac81e047b44e8695371]
 from:
{code:java}
atomjoinSource
@init { gParent.pushMsg("joinSource", state); }
@after { gParent.popMsg(state); }
    :
    tableSource (lateralView^)*
    |
    virtualTableSource (lateralView^)*
    |
    (subQuerySource) => subQuerySource (lateralView^)*
    |
    partitionedTableFunction (lateralView^)*
    |
    LPAREN! joinSource RPAREN!
;{code}
to
{code:java}
atomjoinSource
@init { gParent.pushMsg("joinSource", state); }
@after { gParent.popMsg(state); }
    :  tableSource (lateralView^)*
    |  virtualTableSource (lateralView^)*
    |  (LPAREN (KW_WITH|KW_SELECT|KW_MAP|KW_REDUCE|KW_FROM)) => subQuerySource 
(lateralView^)*
    |  (LPAREN LPAREN atomSelectStatement RPAREN setOperator ) => 
subQuerySource (lateralView^)*
    |  partitionedTableFunction (lateralView^)*
    |  LPAREN! joinSource RPAREN! 
;{code}
When the query is parsed, we end up in the subQuerySource rule because 
atomSelectStatement, by definition, cannot contain SORT BY, CLUSTER BY, 
DISTRIBUTE BY or LIMIT clauses. An exception is thrown because subQuerySource 
requires an identifier which is not present nor needed in this particular 
scenario.

 

I tested it locally and changing _atomSelectStatement_ to _selectStatement_ 
solves the issue. However, I still need to validate that it does not have 
side-effects by running the whole test suite.

was (Author: JIRAUSER298135):
I did some investigation to figure out what was going on. In 
https://issues.apache.org/jira/browse/HIVE-21980, the grammar [changed a 
bit|https://github.com/apache/hive/commit/0f39030c3d33b11ae9c14ac81e047b44e8695371]
 from:

 
{code:java}
atomjoinSource
@init { gParent.pushMsg("joinSource", state); }
@after { gParent.popMsg(state); }
    :
    tableSource (lateralView^)*
    |
    virtualTableSource (lateralView^)*
    |
    (subQuerySource) => subQuerySource (lateralView^)*
    |
    partitionedTableFunction (lateralView^)*
    |
    LPAREN! joinSource RPAREN!
;{code}
to

 
{code:java}
atomjoinSource
@init { gParent.pushMsg("joinSource", state); }
@after { gParent.popMsg(state); }
    :  tableSource (lateralView^)*
    |  virtualTableSource (lateralView^)*
    |  (LPAREN (KW_WITH|KW_SELECT|KW_MAP|KW_REDUCE|KW_FROM)) => subQuerySource 
(lateralView^)*
    |  (LPAREN LPAREN atomSelectStatement RPAREN setOperator ) => 
subQuerySource (lateralView^)*
    |  partitionedTableFunction (lateralView^)*
    |  LPAREN! joinSource RPAREN! 
;{code}
 

 

When the query is parsed, we end up in the subQuerySource rule because 
atomSelectStatement, by definition, cannot contain SORT BY, CLUSTER BY, 
DISTRIBUTE BY or LIMIT clauses. An exception is thrown because subQuerySource 
requires an identifier which is not present nor needed in this particular 
scenario.

 

I tested it locally and changing _atomSelectStatement_ to _selectStatement_ 
solves the issue. However, I still need to validate that it does not have 
side-effects by running the whole test suite.

> Subqueries with a set operator do not support order by clauses
> --
>
> Key: HIVE-27649
> URL: https://issues.apache.org/jira/browse/HIVE-27649
> Project: Hive
>  Issue Type: Bug
>  Components: Parser
>Affects Versions: 3.1.2, 4.0.0
>Reporter: Nicolas Richard
>Priority: Major
>
> Consider the following query:
> {code:java}
> select key from ((select key from src order by key) union (select key from 
> src))subq {code}
> Up until 3.1.2, Hive would parse this query without any problems. However, if 
> you try it on the latest versions, you'll get the following exception:
> {code:java}
> org.apache.hadoop.hive.ql.parse.ParseException: line 1:60 cannot recognize 
> input near 'union' '(' 'select' in subquery source
>         at 
> org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:125)
>         at 
> org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:97) {code}
> With the inner exception stack trace being:
> {code:java}
> NoViableAltException(367@[])
>     at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.identifier(HiveParser_IdentifiersParser.java:14006)
>     at 
> org.apache.hadoop.hive.ql.parse.HiveParser.identifier(HiveParser.java:45086)
>     at 
> org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.subQuerySource(HiveParser_FromClauseParser.java:5411)
>     at 
> 

[jira] [Commented] (HIVE-27649) Subqueries with a set operator do not support order by clauses

2023-08-24 Thread Nicolas Richard (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-27649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17758701#comment-17758701
 ] 

Nicolas Richard commented on HIVE-27649:


I did some investigation to figure out what was going on. In 
https://issues.apache.org/jira/browse/HIVE-21980, the grammar [changed a 
bit|https://github.com/apache/hive/commit/0f39030c3d33b11ae9c14ac81e047b44e8695371]
 from:

 
{code:java}
atomjoinSource
@init { gParent.pushMsg("joinSource", state); }
@after { gParent.popMsg(state); }
    :
    tableSource (lateralView^)*
    |
    virtualTableSource (lateralView^)*
    |
    (subQuerySource) => subQuerySource (lateralView^)*
    |
    partitionedTableFunction (lateralView^)*
    |
    LPAREN! joinSource RPAREN!
;{code}
to

 
{code:java}
atomjoinSource
@init { gParent.pushMsg("joinSource", state); }
@after { gParent.popMsg(state); }
    :  tableSource (lateralView^)*
    |  virtualTableSource (lateralView^)*
    |  (LPAREN (KW_WITH|KW_SELECT|KW_MAP|KW_REDUCE|KW_FROM)) => subQuerySource 
(lateralView^)*
    |  (LPAREN LPAREN atomSelectStatement RPAREN setOperator ) => 
subQuerySource (lateralView^)*
    |  partitionedTableFunction (lateralView^)*
    |  LPAREN! joinSource RPAREN! 
;{code}
 

 

When the query is parsed, we end up in the subQuerySource rule because 
atomSelectStatement, by definition, cannot contain SORT BY, CLUSTER BY, 
DISTRIBUTE BY or LIMIT clauses. An exception is thrown because subQuerySource 
requires an identifier which is not present nor needed in this particular 
scenario.

 

I tested it locally and changing _atomSelectStatement_ to _selectStatement_ 
solves the issue. However, I still need to validate that it does not have 
side-effects by running the whole test suite.

> Subqueries with a set operator do not support order by clauses
> --
>
> Key: HIVE-27649
> URL: https://issues.apache.org/jira/browse/HIVE-27649
> Project: Hive
>  Issue Type: Bug
>  Components: Parser
>Affects Versions: 3.1.2, 4.0.0
>Reporter: Nicolas Richard
>Priority: Major
>
> Consider the following query:
> {code:java}
> select key from ((select key from src order by key) union (select key from 
> src))subq {code}
> Up until 3.1.2, Hive would parse this query without any problems. However, if 
> you try it on the latest versions, you'll get the following exception:
> {code:java}
> org.apache.hadoop.hive.ql.parse.ParseException: line 1:60 cannot recognize 
> input near 'union' '(' 'select' in subquery source
>         at 
> org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:125)
>         at 
> org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:97) {code}
> With the inner exception stack trace being:
> {code:java}
> NoViableAltException(367@[])
>     at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.identifier(HiveParser_IdentifiersParser.java:14006)
>     at 
> org.apache.hadoop.hive.ql.parse.HiveParser.identifier(HiveParser.java:45086)
>     at 
> org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.subQuerySource(HiveParser_FromClauseParser.java:5411)
>     at 
> org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.atomjoinSource(HiveParser_FromClauseParser.java:1921)
>     at 
> org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.joinSource(HiveParser_FromClauseParser.java:2175)
>     at 
> org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.atomjoinSource(HiveParser_FromClauseParser.java:2110)
>     at 
> org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.joinSource(HiveParser_FromClauseParser.java:2175)
>     at 
> org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.fromSource(HiveParser_FromClauseParser.java:1750)
>     at 
> org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.fromClause(HiveParser_FromClauseParser.java:1593)
>     at 
> org.apache.hadoop.hive.ql.parse.HiveParser.fromClause(HiveParser.java:45094)
>     at 
> org.apache.hadoop.hive.ql.parse.HiveParser.atomSelectStatement(HiveParser.java:38538)
>     at 
> org.apache.hadoop.hive.ql.parse.HiveParser.selectStatement(HiveParser.java:38831)
>     at 
> org.apache.hadoop.hive.ql.parse.HiveParser.regularBody(HiveParser.java:38424)
>     at 
> org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpressionBody(HiveParser.java:37686)
>     at 
> org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpression(HiveParser.java:37574)
>     at 
> org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:2757)
>     at 
> org.apache.hadoop.hive.ql.parse.HiveParser.explainStatement(HiveParser.java:1751)
>     at 
> org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1614)
>     at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:123)
>    

[jira] [Updated] (HIVE-27649) Subqueries with a set operator do not support order by clauses

2023-08-24 Thread Nicolas Richard (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nicolas Richard updated HIVE-27649:
---
Description: 
Consider the following query:
{code:java}
select key from ((select key from src order by key) union (select key from 
src))subq {code}
Up until 3.1.2, Hive would parse this query without any problems. However, if 
you try it on the latest versions, you'll get the following exception:
{code:java}
org.apache.hadoop.hive.ql.parse.ParseException: line 1:60 cannot recognize 
input near 'union' '(' 'select' in subquery source
        at 
org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:125)
        at 
org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:97) {code}
With the inner exception stack trace being:
{code:java}
NoViableAltException(367@[])
    at 
org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.identifier(HiveParser_IdentifiersParser.java:14006)
    at 
org.apache.hadoop.hive.ql.parse.HiveParser.identifier(HiveParser.java:45086)
    at 
org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.subQuerySource(HiveParser_FromClauseParser.java:5411)
    at 
org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.atomjoinSource(HiveParser_FromClauseParser.java:1921)
    at 
org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.joinSource(HiveParser_FromClauseParser.java:2175)
    at 
org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.atomjoinSource(HiveParser_FromClauseParser.java:2110)
    at 
org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.joinSource(HiveParser_FromClauseParser.java:2175)
    at 
org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.fromSource(HiveParser_FromClauseParser.java:1750)
    at 
org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.fromClause(HiveParser_FromClauseParser.java:1593)
    at 
org.apache.hadoop.hive.ql.parse.HiveParser.fromClause(HiveParser.java:45094)
    at 
org.apache.hadoop.hive.ql.parse.HiveParser.atomSelectStatement(HiveParser.java:38538)
    at 
org.apache.hadoop.hive.ql.parse.HiveParser.selectStatement(HiveParser.java:38831)
    at 
org.apache.hadoop.hive.ql.parse.HiveParser.regularBody(HiveParser.java:38424)
    at 
org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpressionBody(HiveParser.java:37686)
    at 
org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpression(HiveParser.java:37574)
    at 
org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:2757)
    at 
org.apache.hadoop.hive.ql.parse.HiveParser.explainStatement(HiveParser.java:1751)
    at 
org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1614)
    at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:123)
    at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:97) 
{code}
Note that this behavior also happens if the subquery contains a SORT BY, 
CLUSTER BY, DISTRIBUTE BY or LIMIT clause.

 

  was:
Consider the following query:
{code:java}
select key from ((select key from src order by key) union (select key from 
src))subq {code}
Up until 3.1.2, Hive would parse this query without any problems. However, if 
you try it on the latest versions, you'll get the following exception:
{code:java}
org.apache.hadoop.hive.ql.parse.ParseException: line 1:60 cannot recognize 
input near 'union' '(' 'select' in subquery source
        at 
org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:125)
        at 
org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:97) {code}
With the inner exception stack trace being:
{code:java}
NoViableAltException(367@[])
    at 
org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.identifier(HiveParser_IdentifiersParser.java:14006)
    at 
org.apache.hadoop.hive.ql.parse.HiveParser.identifier(HiveParser.java:45086)
    at 
org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.subQuerySource(HiveParser_FromClauseParser.java:5411)
    at 
org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.atomjoinSource(HiveParser_FromClauseParser.java:1921)
    at 
org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.joinSource(HiveParser_FromClauseParser.java:2175)
    at 
org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.atomjoinSource(HiveParser_FromClauseParser.java:2110)
    at 
org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.joinSource(HiveParser_FromClauseParser.java:2175)
    at 
org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.fromSource(HiveParser_FromClauseParser.java:1750)
    at 
org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.fromClause(HiveParser_FromClauseParser.java:1593)
    at 
org.apache.hadoop.hive.ql.parse.HiveParser.fromClause(HiveParser.java:45094)
    at 
org.apache.hadoop.hive.ql.parse.HiveParser.atomSelectStatement(HiveParser.java:38538)
    at 

[jira] [Updated] (HIVE-27649) Subqueries with a set operator do not support order by statements

2023-08-24 Thread Nicolas Richard (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nicolas Richard updated HIVE-27649:
---
Description: 
Consider the following query:
{code:java}
select key from ((select key from src order by key) union (select key from 
src))subq {code}
Up until 3.1.2, Hive would parse this query without any problems. However, if 
you try it on the latest versions, you'll get the following exception:
{code:java}
org.apache.hadoop.hive.ql.parse.ParseException: line 1:60 cannot recognize 
input near 'union' '(' 'select' in subquery source
        at 
org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:125)
        at 
org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:97) {code}
With the inner exception stack trace being:
{code:java}
NoViableAltException(367@[])
    at 
org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.identifier(HiveParser_IdentifiersParser.java:14006)
    at 
org.apache.hadoop.hive.ql.parse.HiveParser.identifier(HiveParser.java:45086)
    at 
org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.subQuerySource(HiveParser_FromClauseParser.java:5411)
    at 
org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.atomjoinSource(HiveParser_FromClauseParser.java:1921)
    at 
org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.joinSource(HiveParser_FromClauseParser.java:2175)
    at 
org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.atomjoinSource(HiveParser_FromClauseParser.java:2110)
    at 
org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.joinSource(HiveParser_FromClauseParser.java:2175)
    at 
org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.fromSource(HiveParser_FromClauseParser.java:1750)
    at 
org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.fromClause(HiveParser_FromClauseParser.java:1593)
    at 
org.apache.hadoop.hive.ql.parse.HiveParser.fromClause(HiveParser.java:45094)
    at 
org.apache.hadoop.hive.ql.parse.HiveParser.atomSelectStatement(HiveParser.java:38538)
    at 
org.apache.hadoop.hive.ql.parse.HiveParser.selectStatement(HiveParser.java:38831)
    at 
org.apache.hadoop.hive.ql.parse.HiveParser.regularBody(HiveParser.java:38424)
    at 
org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpressionBody(HiveParser.java:37686)
    at 
org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpression(HiveParser.java:37574)
    at 
org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:2757)
    at 
org.apache.hadoop.hive.ql.parse.HiveParser.explainStatement(HiveParser.java:1751)
    at 
org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1614)
    at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:123)
    at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:97) 
{code}
Note that this behavior also happens for ORDER BY, SORT BY, CLUSTER BY, 
DISTRIBUTE BY or LIMIT

 

  was:
Consider the following query:

 
{code:java}
select key from ((select key from src order by key) union (select key from 
src))subq {code}
Up until 3.1.2, Hive would parse this query without any problems. However, if 
you try it on the latest versions, you'll get the following exception:

 
{code:java}
org.apache.hadoop.hive.ql.parse.ParseException: line 1:60 cannot recognize 
input near 'union' '(' 'select' in subquery source
        at 
org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:125)
        at 
org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:97) {code}
With the inner exception stack trace being:
{code:java}
NoViableAltException(367@[])
    at 
org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.identifier(HiveParser_IdentifiersParser.java:14006)
    at 
org.apache.hadoop.hive.ql.parse.HiveParser.identifier(HiveParser.java:45086)
    at 
org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.subQuerySource(HiveParser_FromClauseParser.java:5411)
    at 
org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.atomjoinSource(HiveParser_FromClauseParser.java:1921)
    at 
org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.joinSource(HiveParser_FromClauseParser.java:2175)
    at 
org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.atomjoinSource(HiveParser_FromClauseParser.java:2110)
    at 
org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.joinSource(HiveParser_FromClauseParser.java:2175)
    at 
org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.fromSource(HiveParser_FromClauseParser.java:1750)
    at 
org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.fromClause(HiveParser_FromClauseParser.java:1593)
    at 
org.apache.hadoop.hive.ql.parse.HiveParser.fromClause(HiveParser.java:45094)
    at 
org.apache.hadoop.hive.ql.parse.HiveParser.atomSelectStatement(HiveParser.java:38538)
    at 

[jira] [Updated] (HIVE-27649) Subqueries with a set operator do not support order by clauses

2023-08-24 Thread Nicolas Richard (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nicolas Richard updated HIVE-27649:
---
Summary: Subqueries with a set operator do not support order by clauses  
(was: Subqueries with a set operator do not support order by statements)

> Subqueries with a set operator do not support order by clauses
> --
>
> Key: HIVE-27649
> URL: https://issues.apache.org/jira/browse/HIVE-27649
> Project: Hive
>  Issue Type: Bug
>  Components: Parser
>Affects Versions: 3.1.2, 4.0.0
>Reporter: Nicolas Richard
>Priority: Major
>
> Consider the following query:
> {code:java}
> select key from ((select key from src order by key) union (select key from 
> src))subq {code}
> Up until 3.1.2, Hive would parse this query without any problems. However, if 
> you try it on the latest versions, you'll get the following exception:
> {code:java}
> org.apache.hadoop.hive.ql.parse.ParseException: line 1:60 cannot recognize 
> input near 'union' '(' 'select' in subquery source
>         at 
> org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:125)
>         at 
> org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:97) {code}
> With the inner exception stack trace being:
> {code:java}
> NoViableAltException(367@[])
>     at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.identifier(HiveParser_IdentifiersParser.java:14006)
>     at 
> org.apache.hadoop.hive.ql.parse.HiveParser.identifier(HiveParser.java:45086)
>     at 
> org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.subQuerySource(HiveParser_FromClauseParser.java:5411)
>     at 
> org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.atomjoinSource(HiveParser_FromClauseParser.java:1921)
>     at 
> org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.joinSource(HiveParser_FromClauseParser.java:2175)
>     at 
> org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.atomjoinSource(HiveParser_FromClauseParser.java:2110)
>     at 
> org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.joinSource(HiveParser_FromClauseParser.java:2175)
>     at 
> org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.fromSource(HiveParser_FromClauseParser.java:1750)
>     at 
> org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.fromClause(HiveParser_FromClauseParser.java:1593)
>     at 
> org.apache.hadoop.hive.ql.parse.HiveParser.fromClause(HiveParser.java:45094)
>     at 
> org.apache.hadoop.hive.ql.parse.HiveParser.atomSelectStatement(HiveParser.java:38538)
>     at 
> org.apache.hadoop.hive.ql.parse.HiveParser.selectStatement(HiveParser.java:38831)
>     at 
> org.apache.hadoop.hive.ql.parse.HiveParser.regularBody(HiveParser.java:38424)
>     at 
> org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpressionBody(HiveParser.java:37686)
>     at 
> org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpression(HiveParser.java:37574)
>     at 
> org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:2757)
>     at 
> org.apache.hadoop.hive.ql.parse.HiveParser.explainStatement(HiveParser.java:1751)
>     at 
> org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1614)
>     at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:123)
>     at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:97) 
> {code}
> Note that this behavior also happens for ORDER BY, SORT BY, CLUSTER BY, 
> DISTRIBUTE BY or LIMIT
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HIVE-27649) Subqueries with a set operator do not support order by statements

2023-08-24 Thread Nicolas Richard (Jira)
Nicolas Richard created HIVE-27649:
--

 Summary: Subqueries with a set operator do not support order by 
statements
 Key: HIVE-27649
 URL: https://issues.apache.org/jira/browse/HIVE-27649
 Project: Hive
  Issue Type: Bug
  Components: Parser
Affects Versions: 3.1.2, 4.0.0
Reporter: Nicolas Richard


Consider the following query:

 
{code:java}
select key from ((select key from src order by key) union (select key from 
src))subq {code}
Up until 3.1.2, Hive would parse this query without any problems. However, if 
you try it on the latest versions, you'll get the following exception:

 
{code:java}
org.apache.hadoop.hive.ql.parse.ParseException: line 1:60 cannot recognize 
input near 'union' '(' 'select' in subquery source
        at 
org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:125)
        at 
org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:97) {code}
With the inner exception stack trace being:
{code:java}
NoViableAltException(367@[])
    at 
org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.identifier(HiveParser_IdentifiersParser.java:14006)
    at 
org.apache.hadoop.hive.ql.parse.HiveParser.identifier(HiveParser.java:45086)
    at 
org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.subQuerySource(HiveParser_FromClauseParser.java:5411)
    at 
org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.atomjoinSource(HiveParser_FromClauseParser.java:1921)
    at 
org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.joinSource(HiveParser_FromClauseParser.java:2175)
    at 
org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.atomjoinSource(HiveParser_FromClauseParser.java:2110)
    at 
org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.joinSource(HiveParser_FromClauseParser.java:2175)
    at 
org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.fromSource(HiveParser_FromClauseParser.java:1750)
    at 
org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.fromClause(HiveParser_FromClauseParser.java:1593)
    at 
org.apache.hadoop.hive.ql.parse.HiveParser.fromClause(HiveParser.java:45094)
    at 
org.apache.hadoop.hive.ql.parse.HiveParser.atomSelectStatement(HiveParser.java:38538)
    at 
org.apache.hadoop.hive.ql.parse.HiveParser.selectStatement(HiveParser.java:38831)
    at 
org.apache.hadoop.hive.ql.parse.HiveParser.regularBody(HiveParser.java:38424)
    at 
org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpressionBody(HiveParser.java:37686)
    at 
org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpression(HiveParser.java:37574)
    at 
org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:2757)
    at 
org.apache.hadoop.hive.ql.parse.HiveParser.explainStatement(HiveParser.java:1751)
    at 
org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1614)
    at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:123)
    at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:97) 
{code}
 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HIVE-27648) CREATE TABLE with CHECK constraint fails with SemanticException

2023-08-24 Thread Soumyakanti Das (Jira)
Soumyakanti Das created HIVE-27648:
--

 Summary: CREATE TABLE with CHECK constraint fails with 
SemanticException
 Key: HIVE-27648
 URL: https://issues.apache.org/jira/browse/HIVE-27648
 Project: Hive
  Issue Type: Bug
  Components: Hive
Reporter: Soumyakanti Das


When we run:
{code:java}
create table test (
col1 int,
`col 2` int check (`col 2` > 10) enable novalidate rely,
constraint check_constraint check (col1 + `col 2` > 15) enable novalidate 
rely
);
{code}
It fails with:

 
{code:java}
 org.apache.hadoop.hive.ql.parse.SemanticException: Invalid Constraint syntax 
Invalid CHECK constraint expression: col 2 > 10.
    at 
org.apache.hadoop.hive.ql.ddl.table.constraint.ConstraintsUtils.validateCheckConstraint(ConstraintsUtils.java:462)
    at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeCreateTable(SemanticAnalyzer.java:13839)
    at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genResolvedParseTree(SemanticAnalyzer.java:12618)
    at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12787)
    at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:467)
    at 
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:327)
    at org.apache.hadoop.hive.ql.Compiler.analyze(Compiler.java:224)
    at org.apache.hadoop.hive.ql.Compiler.compile(Compiler.java:107)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:519)
    at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:471)
    at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:436)
    at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:430)
    at 
org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:121)
    at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:227)
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:257)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd1(CliDriver.java:201)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:127)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:425)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:356)
    at 
org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:733)
    at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:703)
    at 
org.apache.hadoop.hive.cli.control.CoreCliDriver.runTest(CoreCliDriver.java:115)
    at 
org.apache.hadoop.hive.cli.control.CliAdapter.runTest(CliAdapter.java:157)
    at 
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver(TestMiniLlapLocalCliDriver.java:62)
 {code}
 

I noticed while debugging that the check constraint expression in 
[cc.getCheck_expression()|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/ddl/table/constraint/ConstraintsUtils.java#L446]
 doesn't include the backticks (`), and this results in wrong token generation.
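
For illustration only (hypothetical tokenization, just to show why the lost 
backticks matter when the stored expression is re-parsed):
{code:java}
-- With backticks the expression parses as one column reference:
`col 2` > 10

-- Without backticks (what the constraint check reportedly receives),
-- "col 2" can no longer be read as a single identifier, so parsing fails:
col 2 > 10
{code}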



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work started] (HIVE-27640) Counter for query concurrency

2023-08-24 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-27640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-27640 started by László Bodor.
---
> Counter for query concurrency
> -
>
> Key: HIVE-27640
> URL: https://issues.apache.org/jira/browse/HIVE-27640
> Project: Hive
>  Issue Type: Sub-task
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
>
> This is hard to catch easily, but I would like to see something/anything 
> about query concurrency in the query counters. This way we can instantly see 
> in the query summary what happened. I mean counters like the following (a 
> minimal sketch follows below):
> 1. how many queries were running when this query arrived
> 2. same as 1) but at the query stage level
> 2a) how many queries were being compiled (or waiting for compilation) when 
> this query started to compile (or started to be enqueued for compilation)
> 2b) how many queries were waiting for a coordinator when this query started 
> to get a coordinator
> 2c) how many queries were in the Run DAG phase, when this query started to 
> run DAG
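
A minimal sketch of what such counters could look like (the class and method 
names below are assumed for illustration and are not existing HiveServer2 APIs): 
a per-stage gauge that is sampled when a query enters the stage, so the sampled 
value can be published as a counter of that query.

{code:java}
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative sketch only; names are assumed and do not refer to real Hive classes.
public final class QueryConcurrencyGauges {

  private static final AtomicInteger COMPILING = new AtomicInteger();
  private static final AtomicInteger RUNNING_DAG = new AtomicInteger();

  /** Called when a query enters compilation (or the compilation queue);
   *  returns how many other queries were already in that stage. */
  public static int enterCompilation() {
    return COMPILING.getAndIncrement();
  }

  public static void leaveCompilation() {
    COMPILING.decrementAndGet();
  }

  /** Called when a query starts the Run DAG phase;
   *  returns how many other DAGs were already running. */
  public static int enterRunDag() {
    return RUNNING_DAG.getAndIncrement();
  }

  public static void leaveRunDag() {
    RUNNING_DAG.decrementAndGet();
  }
}
{code}

The values returned by enterCompilation()/enterRunDag() would then be recorded 
into the query's counters, giving the "how many queries were already in this 
stage when this query arrived" numbers described above.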



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-27647) NullPointerException from LowLevelCacheImpl

2023-08-24 Thread Kwangwon (Trey) Yi (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kwangwon (Trey) Yi updated HIVE-27647:
--
Description: 
Hi all,

 

I've executed Hive using Tez engine, and got the below NPE.

It seems like there was a NullPointerException error from LLAP 
LowLevelCacheImpl class when `putFileData` was called.

What are some possible reasons for this error and suggestions to mitigate it?

 

I wasn't able to find the exact same issue from the past Jira tickets.

 

It would be much appreciated if you could provide me some sort of advice.

Thank you.

 

 
{quote}2023-08-24 10:22:31,122 [INFO] [Dispatcher thread \{Central}] |HistoryEventHandler.criticalEvents|: 
[HISTORY][DAG:dag_1691565142260_0112_1][Event:TASK_ATTEMPT_FINISHED]: 
vertexName=Map 4, taskAttemptId=attempt_1691565142260_0112_1_00_10_2, 
creationTime=1691962230725, allocationTime=1691962230727, 
startTime=1691962230728, finishTime=1691962230991, timeTaken=263, 
status=FAILED, taskFailureType=NON_FATAL, errorEnum=FRAMEWORK_ERROR, 
diagnostics=Error: Error while running task ( failure ) : 
attempt_1691565142260_0112_1_00_10_2:java.lang.RuntimeException: 
org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: 
java.lang.NullPointerException
    at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:303)
    at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:254)
    at 
org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
    at 
org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
    at 
org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
    at 
org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
    at 
org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
    at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
    at 
org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:118)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:750)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
java.io.IOException: java.lang.NullPointerException
    at 
org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:80)
    at 
org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:426)
    at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:272)
    ... 15 more
Caused by: java.io.IOException: java.lang.NullPointerException
    at 
org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
    at 
org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
    at 
org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:376)
    at 
org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:82)
    at 
org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:119)
    at 
org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:59)
    at 
org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:151)
    at org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:116)
    at 
org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68)
    ... 17 more
Caused by: java.lang.NullPointerException
    at 
org.apache.hadoop.hive.llap.cache.LowLevelCacheImpl.putFileData(LowLevelCacheImpl.java:300)
    at 
org.apache.hadoop.hive.llap.io.api.impl.LlapIoImpl$GenericDataCache.putFileData(LlapIoImpl.java:303)
    at 
org.apache.hadoop.hive.llap.LlapCacheAwareFs$CacheAwareInputStream.read(LlapCacheAwareFs.java:324)
    at org.apache.commons.io.IOUtils.read(IOUtils.java:1542)
    at org.apache.commons.io.IOUtils.readFully(IOUtils.java:1658)
    at 
org.apache.hadoop.util.ByteBufferIOUtils.readFullyHeapBuffer(ByteBufferIOUtils.java:89)
    at 
org.apache.hadoop.util.ByteBufferIOUtils.readFully(ByteBufferIOUtils.java:53)
    at 

[jira] [Updated] (HIVE-27647) NullPointerException from LowLevelCacheImpl

2023-08-24 Thread Kwangwon (Trey) Yi (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kwangwon (Trey) Yi updated HIVE-27647:
--
Description: 
Hi all,

 

I've executed Hive using Tez engine, and got the below NPE.

It seems like there was a NullPointerException error from LLAP 
LowLevelCacheImpl class when `putFileData` was called.

What are some possible reasons for this error and suggestions to mitigate it?

 

I wasn't able to find the exact same issue from the past Jira tickets.

 

It would be much appreciated if you could provide me some sort of advice.

Thank you.
{quote}2023-08-24 10:22:31,122 [INFO] [Dispatcher thread \{Central}] |HistoryEventHandler.criticalEvents|: 
[HISTORY][DAG:dag_1691565142260_0112_1][Event:TASK_ATTEMPT_FINISHED]: 
vertexName=Map 4, taskAttemptId=attempt_1691565142260_0112_1_00_10_2, 
creationTime=1691962230725, allocationTime=1691962230727, 
startTime=1691962230728, finishTime=1691962230991, timeTaken=263, 
status=FAILED, taskFailureType=NON_FATAL, errorEnum=FRAMEWORK_ERROR, 
diagnostics=Error: Error while running task ( failure ) : 
attempt_1691565142260_0112_1_00_10_2:java.lang.RuntimeException: 
org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: 
java.lang.NullPointerException
    at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:303)
    at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:254)
    at 
org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
    at 
org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
    at 
org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
    at 
org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
    at 
org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
    at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
    at 
org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:118)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:750)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
java.io.IOException: java.lang.NullPointerException
    at 
org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:80)
    at 
org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:426)
    at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:272)
    ... 15 more
Caused by: java.io.IOException: java.lang.NullPointerException
    at 
org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
    at 
org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
    at 
org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:376)
    at 
org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:82)
    at 
org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:119)
    at 
org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:59)
    at 
org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:151)
    at org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:116)
    at 
org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68)
    ... 17 more
Caused by: java.lang.NullPointerException
    at 
org.apache.hadoop.hive.llap.cache.LowLevelCacheImpl.putFileData(LowLevelCacheImpl.java:300)
    at 
org.apache.hadoop.hive.llap.io.api.impl.LlapIoImpl$GenericDataCache.putFileData(LlapIoImpl.java:303)
    at 
org.apache.hadoop.hive.llap.LlapCacheAwareFs$CacheAwareInputStream.read(LlapCacheAwareFs.java:324)
    at org.apache.commons.io.IOUtils.read(IOUtils.java:1542)
    at org.apache.commons.io.IOUtils.readFully(IOUtils.java:1658)
    at 
org.apache.hadoop.util.ByteBufferIOUtils.readFullyHeapBuffer(ByteBufferIOUtils.java:89)
    at 
org.apache.hadoop.util.ByteBufferIOUtils.readFully(ByteBufferIOUtils.java:53)
    at 

[jira] [Updated] (HIVE-27647) NullPointerException from LowLevelCacheImpl

2023-08-24 Thread Kwangwon (Trey) Yi (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kwangwon (Trey) Yi updated HIVE-27647:
--
Description: 
Hi all,

 

I've executed Hive using Tez engine, and got the below NPE.

It seems like there was a NullPointerException error from LLAP 
LowLevelCacheImpl class when `putFileData` was called.

What are some possible reasons for this error and suggestions to mitigate it?

 

I wasn't able to find the exact same issue from the past Jira tickets.

 

It would be much appreciated if you could provide me some sort of advice.

Thank you.
{quote}2023-08-24 10:22:31,122 [INFO] [Dispatcher thread \{Central}] |HistoryEventHandler.criticalEvents|: 
[HISTORY][DAG:dag_1691565142260_0112_1][Event:TASK_ATTEMPT_FINISHED]: 
vertexName=Map 4, taskAttemptId=attempt_1691565142260_0112_1_00_10_2, 
creationTime=1691962230725, allocationTime=1691962230727, 
startTime=1691962230728, finishTime=1691962230991, timeTaken=263, 
status=FAILED, taskFailureType=NON_FATAL, errorEnum=FRAMEWORK_ERROR, 
diagnostics=Error: Error while running task ( failure ) : 
attempt_1691565142260_0112_1_00_10_2:java.lang.RuntimeException: 
org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: 
java.lang.NullPointerException
    at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:303)
    at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:254)
    at 
org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
    at 
org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
    at 
org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
    at 
org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
    at 
org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
    at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
    at 
org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:118)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:750)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
java.io.IOException: java.lang.NullPointerException
    at 
org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:80)
    at 
org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:426)
    at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:272)
    ... 15 more
Caused by: java.io.IOException: java.lang.NullPointerException
    at 
org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
    at 
org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
    at 
org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:376)
    at 
org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:82)
    at 
org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:119)
    at 
org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:59)
    at 
org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:151)
    at org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:116)
    at 
org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68)
    ... 17 more
Caused by: java.lang.NullPointerException
    at 
org.apache.hadoop.hive.llap.cache.LowLevelCacheImpl.putFileData(LowLevelCacheImpl.java:300)
    at 
org.apache.hadoop.hive.llap.io.api.impl.LlapIoImpl$GenericDataCache.putFileData(LlapIoImpl.java:303)
    at 
org.apache.hadoop.hive.llap.LlapCacheAwareFs$CacheAwareInputStream.read(LlapCacheAwareFs.java:324)
    at org.apache.commons.io.IOUtils.read(IOUtils.java:1542)
    at org.apache.commons.io.IOUtils.readFully(IOUtils.java:1658)
    at 
org.apache.hadoop.util.ByteBufferIOUtils.readFullyHeapBuffer(ByteBufferIOUtils.java:89)
    at 
org.apache.hadoop.util.ByteBufferIOUtils.readFully(ByteBufferIOUtils.java:53)
    at 

[jira] [Updated] (HIVE-27647) NullPointerException from LowLevelCacheImpl

2023-08-24 Thread Kwangwon (Trey) Yi (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kwangwon (Trey) Yi updated HIVE-27647:
--
Description: 
Hi all,

 

I've executed Hive using Tez engine, and got the below NPE.

It seems like there was a NullPointerException error from LLAP 
LowLevelCacheImpl class when `putFileData` was called.

What are some possible reasons for this error and suggestions to mitigate it?

 

I wasn't able to find the exact same issue from the past Jira tickets.

 

It would be much appreciated if you could provide me some sort of advice.

Thank you.

 

 
{quote}2023-08-24 10:22:31,122 [INFO] [Dispatcher thread \{Central}] |HistoryEventHandler.criticalEvents|: 
[HISTORY][DAG:dag_1691565142260_0112_1][Event:TASK_ATTEMPT_FINISHED]: 
vertexName=Map 4, taskAttemptId=attempt_1691565142260_0112_1_00_10_2, 
creationTime=1691962230725, allocationTime=1691962230727, 
startTime=1691962230728, finishTime=1691962230991, timeTaken=263, 
status=FAILED, taskFailureType=NON_FATAL, errorEnum=FRAMEWORK_ERROR, 
diagnostics=Error: Error while running task ( failure ) : 
attempt_1691565142260_0112_1_00_10_2:java.lang.RuntimeException: 
org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: 
java.lang.NullPointerException
    at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:303)
    at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:254)
    at 
org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
    at 
org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
    at 
org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
    at 
org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
    at 
org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
    at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
    at 
org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:118)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:750)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
java.io.IOException: java.lang.NullPointerException
    at 
org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:80)
    at 
org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:426)
    at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:272)
    ... 15 more
Caused by: java.io.IOException: java.lang.NullPointerException
    at 
org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
    at 
org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
    at 
org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:376)
    at 
org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:82)
    at 
org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:119)
    at 
org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:59)
    at 
org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:151)
    at org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:116)
    at 
org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68)
    ... 17 more
Caused by: java.lang.NullPointerException
    at 
org.apache.hadoop.hive.llap.cache.LowLevelCacheImpl.putFileData(LowLevelCacheImpl.java:300)
    at 
org.apache.hadoop.hive.llap.io.api.impl.LlapIoImpl$GenericDataCache.putFileData(LlapIoImpl.java:303)
    at 
org.apache.hadoop.hive.llap.LlapCacheAwareFs$CacheAwareInputStream.read(LlapCacheAwareFs.java:324)
    at org.apache.commons.io.IOUtils.read(IOUtils.java:1542)
    at org.apache.commons.io.IOUtils.readFully(IOUtils.java:1658)
    at 
org.apache.hadoop.util.ByteBufferIOUtils.readFullyHeapBuffer(ByteBufferIOUtils.java:89)
    at 
org.apache.hadoop.util.ByteBufferIOUtils.readFully(ByteBufferIOUtils.java:53)
    at 

[jira] [Updated] (HIVE-27647) NullPointerException from LowLevelCacheImpl

2023-08-24 Thread Kwangwon (Trey) Yi (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kwangwon (Trey) Yi updated HIVE-27647:
--
Description: 
Hi all,

 

I've executed Hive using Tez engine, and got the below NPE.

It seems like there was a NullPointerException error from LLAP 
LowLevelCacheImpl class when `putFileData` was called.

What are some possible reasons for this error and suggestions to mitigate it?

 

It would be much appreciated if you could provide me some sort of advice.

Thank you.

 

 
{quote}2023-08-24 10:22:31,122 [INFO] [Dispatcher thread \{Central}] |HistoryEventHandler.criticalEvents|: 
[HISTORY][DAG:dag_1691565142260_0112_1][Event:TASK_ATTEMPT_FINISHED]: 
vertexName=Map 4, taskAttemptId=attempt_1691565142260_0112_1_00_10_2, 
creationTime=1691962230725, allocationTime=1691962230727, 
startTime=1691962230728, finishTime=1691962230991, timeTaken=263, 
status=FAILED, taskFailureType=NON_FATAL, errorEnum=FRAMEWORK_ERROR, 
diagnostics=Error: Error while running task ( failure ) : 
attempt_1691565142260_0112_1_00_10_2:java.lang.RuntimeException: 
org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: 
java.lang.NullPointerException
    at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:303)
    at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:254)
    at 
org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
    at 
org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
    at 
org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
    at 
org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
    at 
org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
    at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
    at 
org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:118)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:750)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
java.io.IOException: java.lang.NullPointerException
    at 
org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:80)
    at 
org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:426)
    at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:272)
    ... 15 more
Caused by: java.io.IOException: java.lang.NullPointerException
    at 
org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
    at 
org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
    at 
org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:376)
    at 
org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:82)
    at 
org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:119)
    at 
org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:59)
    at 
org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:151)
    at org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:116)
    at 
org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68)
    ... 17 more
Caused by: java.lang.NullPointerException
    at 
org.apache.hadoop.hive.llap.cache.LowLevelCacheImpl.putFileData(LowLevelCacheImpl.java:300)
    at 
org.apache.hadoop.hive.llap.io.api.impl.LlapIoImpl$GenericDataCache.putFileData(LlapIoImpl.java:303)
    at 
org.apache.hadoop.hive.llap.LlapCacheAwareFs$CacheAwareInputStream.read(LlapCacheAwareFs.java:324)
    at org.apache.commons.io.IOUtils.read(IOUtils.java:1542)
    at org.apache.commons.io.IOUtils.readFully(IOUtils.java:1658)
    at 
org.apache.hadoop.util.ByteBufferIOUtils.readFullyHeapBuffer(ByteBufferIOUtils.java:89)
    at 
org.apache.hadoop.util.ByteBufferIOUtils.readFully(ByteBufferIOUtils.java:53)
    at 
org.apache.hadoop.fs.DefaultMultiByteBufferReader.readFullyIntoBuffers(DefaultMultiByteBufferReader.java:36)
    at 

[jira] [Updated] (HIVE-27647) NullPointerException from LowLevelCacheImpl

2023-08-24 Thread Kwangwon (Trey) Yi (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kwangwon (Trey) Yi updated HIVE-27647:
--
Description: 
Hi all,

 

I've executed Hive using Tez engine, and got the below NPE.

It seems like there was a NullPointerException error from LLAP 
LowLevelCacheImpl class when `putFileData` was called.

What are some possible reasons for this error and suggestions to mitigate it?

 

It would be much appreciated if you could provide me some sort of advice.

Thank you.

 

 
{quote}2023-08-24 10:22:31,122 [INFO] [Dispatcher thread \{Central}] |HistoryEventHandler.criticalEvents|: 
[HISTORY][DAG:dag_1691565142260_0112_1][Event:TASK_ATTEMPT_FINISHED]: 
vertexName=Map 4, taskAttemptId=attempt_1691565142260_0112_1_00_10_2, 
creationTime=1691962230725, allocationTime=1691962230727, 
startTime=1691962230728, finishTime=1691962230991, timeTaken=263, 
status=FAILED, taskFailureType=NON_FATAL, errorEnum=FRAMEWORK_ERROR, 
diagnostics=Error: Error while running task ( failure ) : 
attempt_1691565142260_0112_1_00_10_2:java.lang.RuntimeException: 
org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: 
java.lang.NullPointerException
    at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:303)
    at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:254)
    at 
org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
    at 
org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
    at 
org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
    at 
org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
    at 
org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
    at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
    at 
org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:118)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:750)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
java.io.IOException: java.lang.NullPointerException
    at 
org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:80)
    at 
org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:426)
    at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:272)
    ... 15 more
Caused by: java.io.IOException: java.lang.NullPointerException
    at 
org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
    at 
org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
    at 
org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:376)
    at 
org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:82)
    at 
org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:119)
    at 
org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:59)
    at 
org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:151)
    at org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:116)
    at 
org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68)
    ... 17 more
Caused by: java.lang.NullPointerException
    at 
org.apache.hadoop.hive.llap.cache.LowLevelCacheImpl.putFileData(LowLevelCacheImpl.java:300)
    at 
org.apache.hadoop.hive.llap.io.api.impl.LlapIoImpl$GenericDataCache.putFileData(LlapIoImpl.java:303)
    at 
org.apache.hadoop.hive.llap.LlapCacheAwareFs$CacheAwareInputStream.read(LlapCacheAwareFs.java:324)
    at org.apache.commons.io.IOUtils.read(IOUtils.java:1542)
    at org.apache.commons.io.IOUtils.readFully(IOUtils.java:1658)
    at 
org.apache.hadoop.util.ByteBufferIOUtils.readFullyHeapBuffer(ByteBufferIOUtils.java:89)
    at 
org.apache.hadoop.util.ByteBufferIOUtils.readFully(ByteBufferIOUtils.java:53)
    at 
org.apache.hadoop.fs.DefaultMultiByteBufferReader.readFullyIntoBuffers(DefaultMultiByteBufferReader.java:36)
    at 

[jira] [Updated] (HIVE-27647) NullPointerException from LowLevelCacheImpl

2023-08-24 Thread Kwangwon (Trey) Yi (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kwangwon (Trey) Yi updated HIVE-27647:
--
Description: 
Hi all,

 

I've executed Hive using Tez engine, and got the below NPE.

It seems like there was a NullPointerException error from LLAP 
LowLevelCacheImpl class when `putFileData` was called.

What are some possible reasons for this error and suggestions to mitigate it?

 

It would be much appreciated if you could provide me some sort of advice.

Thank you.

 
2023-08-24 10:22:31,122 [INFO] [Dispatcher thread \{Central}] 
|HistoryEventHandler.criticalEvents|: 
[HISTORY][DAG:dag_1691565142260_0112_1][Event:TASK_ATTEMPT_FINISHED]: 
vertexName=Map 4, taskAttemptId=attempt_1691565142260_0112_1_00_10_2, 
creationTime=1691962230725, allocationTime=1691962230727, 
startTime=1691962230728, finishTime=1691962230991, timeTaken=263, 
status=FAILED, taskFailureType=NON_FATAL, errorEnum=FRAMEWORK_ERROR, 
diagnostics=Error: Error while running task ( failure ) : 
attempt_1691565142260_0112_1_00_10_2:java.lang.RuntimeException: 
org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: 
java.lang.NullPointerException
    at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:303)
    at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:254)
    at 

[jira] [Updated] (HIVE-27647) NullPointerException from LowLevelCacheImpl

2023-08-24 Thread Kwangwon (Trey) Yi (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kwangwon (Trey) Yi updated HIVE-27647:
--
Description: 
Hi all,

 

I've executed Hive using Tez engine, and got the below NPE.

It seems like there was a NullPointerException error from LLAP 
LowLevelCacheImpl class when `putFileData` was called.

What are some possible reasons for this error and suggestions to mitigate it?

 

It would be much appreciated if you could provide me some sort of advice.

Thank you.

 
{quote}2023-08-24 10:22:31,122 [INFO] [Dispatcher thread \{Central}] |HistoryEventHandler.criticalEvents|: 
[HISTORY][DAG:dag_1691565142260_0112_1][Event:TASK_ATTEMPT_FINISHED]: 
vertexName=Map 4, taskAttemptId=attempt_1691565142260_0112_1_00_10_2, 
creationTime=1691962230725, allocationTime=1691962230727, 
startTime=1691962230728, finishTime=1691962230991, timeTaken=263, 
status=FAILED, taskFailureType=NON_FATAL, errorEnum=FRAMEWORK_ERROR, 
diagnostics=Error: Error while running task ( failure ) : 
attempt_1691565142260_0112_1_00_10_2:java.lang.RuntimeException: 
org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: 
java.lang.NullPointerException at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:303)
 at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:254) 
at 
org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
 at 
org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
 at 
org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
 at java.security.AccessController.doPrivileged(Native Method) at 
javax.security.auth.Subject.doAs(Subject.java:422) at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
 at 
org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
 at 
org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
 at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) at 
org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:118)
 at java.util.concurrent.FutureTask.run(FutureTask.java:266) at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
at java.lang.Thread.run(Thread.java:750) Caused by: 
org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: 
java.lang.NullPointerException at 
org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:80)
 at 
org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:426)
 at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:272)
 ... 15 more Caused by: java.io.IOException: java.lang.NullPointerException at 
org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
 at 
org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
 at 
org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:376)
 at 
org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:82) 
at 
org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:119)
 at 
org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:59)
 at 
org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:151)
 at org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:116) 
at 
org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68)
 ... 17 more Caused by: java.lang.NullPointerException at 
org.apache.hadoop.hive.llap.cache.LowLevelCacheImpl.putFileData(LowLevelCacheImpl.java:300)
 at 
org.apache.hadoop.hive.llap.io.api.impl.LlapIoImpl$GenericDataCache.putFileData(LlapIoImpl.java:303)
 at 
org.apache.hadoop.hive.llap.LlapCacheAwareFs$CacheAwareInputStream.read(LlapCacheAwareFs.java:324)
 at org.apache.commons.io.IOUtils.read(IOUtils.java:1542) at 
org.apache.commons.io.IOUtils.readFully(IOUtils.java:1658) at 
org.apache.hadoop.util.ByteBufferIOUtils.readFullyHeapBuffer(ByteBufferIOUtils.java:89)
 at 
org.apache.hadoop.util.ByteBufferIOUtils.readFully(ByteBufferIOUtils.java:53) 
at 
org.apache.hadoop.fs.DefaultMultiByteBufferReader.readFullyIntoBuffers(DefaultMultiByteBufferReader.java:36)
 at 
org.apache.hadoop.fs.FSDataInputStream.readFullyIntoBuffers(FSDataInputStream.java:264)
 at 

[jira] [Updated] (HIVE-27647) NullPointerException from LowLevelCacheImpl

2023-08-24 Thread Kwangwon (Trey) Yi (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kwangwon (Trey) Yi updated HIVE-27647:
--
Description: 
Hi all,

 

I've executed Hive using Tez engine, and got the below NPE.

It seems like there was a NullPointerException error from LLAP 
LowLevelCacheImpl class when `putFileData` was called.

What are some possible reasons for this error and suggestions to mitigate it?

 

It would be much appreciated if you could provide me some sort of advice.

Thank you.

 
{quote}2023-08-24 10:22:31,122 [INFO] [Dispatcher thread \{Central}] 
|HistoryEventHandler.criticalEvents|: 
[HISTORY][DAG:dag_1691565142260_0112_1][Event:TASK_ATTEMPT_FINISHED]: 
vertexName=Map 4, taskAttemptId=attempt_1691565142260_0112_1_00_10_2, 
creationTime=1691962230725, allocationTime=1691962230727, 
startTime=1691962230728, finishTime=1691962230991, timeTaken=263, 
status=FAILED, taskFailureType=NON_FATAL, errorEnum=FRAMEWORK_ERROR, 
diagnostics=Error: Error while running task ( failure ) : 
attempt_1691565142260_0112_1_00_10_2:java.lang.RuntimeException: 
org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: 
java.lang.NullPointerException at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:303)
 at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:254) 
at 
org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
 at 
org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
 at 
org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
 at java.security.AccessController.doPrivileged(Native Method) at 
javax.security.auth.Subject.doAs(Subject.java:422) at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
 at 
org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
 at 
org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
 at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) at 
org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:118)
 at java.util.concurrent.FutureTask.run(FutureTask.java:266) at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
at java.lang.Thread.run(Thread.java:750) Caused by: 
org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: 
java.lang.NullPointerException at 
org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:80)
 at 
org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:426)
 at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:272)
 ... 15 more Caused by: java.io.IOException: java.lang.NullPointerException at 
org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
 at 
org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
 at 
org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:376)
 at 
org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:82) 
at 
org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:119)
 at 
org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:59)
 at 
org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:151)
 at org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:116) 
at 
org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68)
 ... 17 more Caused by: java.lang.NullPointerException at 
org.apache.hadoop.hive.llap.cache.LowLevelCacheImpl.putFileData(LowLevelCacheImpl.java:300)
 at 
org.apache.hadoop.hive.llap.io.api.impl.LlapIoImpl$GenericDataCache.putFileData(LlapIoImpl.java:303)
 at 
org.apache.hadoop.hive.llap.LlapCacheAwareFs$CacheAwareInputStream.read(LlapCacheAwareFs.java:324)
 at org.apache.commons.io.IOUtils.read(IOUtils.java:1542) at 
org.apache.commons.io.IOUtils.readFully(IOUtils.java:1658) at 
org.apache.hadoop.util.ByteBufferIOUtils.readFullyHeapBuffer(ByteBufferIOUtils.java:89)
 at 
org.apache.hadoop.util.ByteBufferIOUtils.readFully(ByteBufferIOUtils.java:53) 
at 
org.apache.hadoop.fs.DefaultMultiByteBufferReader.readFullyIntoBuffers(DefaultMultiByteBufferReader.java:36)
 at 
org.apache.hadoop.fs.FSDataInputStream.readFullyIntoBuffers(FSDataInputStream.java:264)
 at 

[jira] [Created] (HIVE-27647) NullPointerException from LowLevelCacheImpl

2023-08-24 Thread Kwangwon (Trey) Yi (Jira)
Kwangwon (Trey) Yi created HIVE-27647:
-

 Summary: NullPointerException from LowLevelCacheImpl
 Key: HIVE-27647
 URL: https://issues.apache.org/jira/browse/HIVE-27647
 Project: Hive
  Issue Type: Bug
  Components: Tez
Affects Versions: 3.1.3
 Environment: Hive: 3.1.3

Tez: 0.9.2
Reporter: Kwangwon (Trey) Yi


Hi all,

 

I've executed Hive using Tez engine, and got the below NPE.

 

It seems like there was a NullPointerException error from LLAP 
LowLevelCacheImpl class when `putFileData` was called.

 

What are some possible reasons for this error and suggestions to mitigate it?

 

It would be much appreciated if you could provide me some sort of advice.

Thank you.

 

```

2023-08-24 10:22:31,122 [INFO] [Dispatcher thread \{Central}] 
|HistoryEventHandler.criticalEvents|: 
[HISTORY][DAG:dag_1691565142260_0112_1][Event:TASK_ATTEMPT_FINISHED]: 
vertexName=Map 4, taskAttemptId=attempt_1691565142260_0112_1_00_10_2, 
creationTime=1691962230725, allocationTime=1691962230727, 
startTime=1691962230728, finishTime=1691962230991, timeTaken=263, 
status=FAILED, taskFailureType=NON_FATAL, errorEnum=FRAMEWORK_ERROR, 
diagnostics=Error: Error while running task ( failure ) : 
attempt_1691565142260_0112_1_00_10_2:java.lang.RuntimeException: 
org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: 
java.lang.NullPointerException at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:303)
 at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:254) 
at 
org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
 at 
org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
 at 
org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
 at java.security.AccessController.doPrivileged(Native Method) at 
javax.security.auth.Subject.doAs(Subject.java:422) at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
 at 
org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
 at 
org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
 at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) at 
org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:118)
 at java.util.concurrent.FutureTask.run(FutureTask.java:266) at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
at java.lang.Thread.run(Thread.java:750) Caused by: 
org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: 
java.lang.NullPointerException at 
org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:80)
 at 
org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:426)
 at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:272)
 ... 15 more Caused by: java.io.IOException: java.lang.NullPointerException at 
org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
 at 
org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
 at 
org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:376)
 at 
org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:82) 
at 
org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:119)
 at 
org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:59)
 at 
org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:151)
 at org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:116) 
at 
org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68)
 ... 17 more Caused by: java.lang.NullPointerException at 
org.apache.hadoop.hive.llap.cache.LowLevelCacheImpl.putFileData(LowLevelCacheImpl.java:300)
 at 
org.apache.hadoop.hive.llap.io.api.impl.LlapIoImpl$GenericDataCache.putFileData(LlapIoImpl.java:303)
 at 
org.apache.hadoop.hive.llap.LlapCacheAwareFs$CacheAwareInputStream.read(LlapCacheAwareFs.java:324)
 at org.apache.commons.io.IOUtils.read(IOUtils.java:1542) at 
org.apache.commons.io.IOUtils.readFully(IOUtils.java:1658) at 
org.apache.hadoop.util.ByteBufferIOUtils.readFullyHeapBuffer(ByteBufferIOUtils.java:89)
 at 
org.apache.hadoop.util.ByteBufferIOUtils.readFully(ByteBufferIOUtils.java:53) 
at 

[jira] [Resolved] (HIVE-27595) Improve efficiency in the filtering hooks

2023-08-24 Thread Naveen Gangam (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam resolved HIVE-27595.
--
Fix Version/s: 4.0.0
   Resolution: Fixed

The fix has been merged to master. Thank you for the patch, [~henrib], and for 
the reviews, [~hemanth619] and [~jfs].

> Improve efficiency in the filtering hooks
> -
>
> Key: HIVE-27595
> URL: https://issues.apache.org/jira/browse/HIVE-27595
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 4.0.0-alpha-2
>Reporter: Naveen Gangam
>Assignee: Henri Biestro
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> https://github.com/apache/hive/blob/a406d6d4417277e45b93f1733bed5201afdee29b/ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/metastore/HiveMetaStoreAuthorizer.java#L353-L377
> In cases where the tableList has a large number of tables (tested with 200k in 
> my case), the hivePrivilegedObjects list could be just as big. So both these 
> lists are 200k. 
> Essentially, the code is trying to return the subset of the tableList collection 
> that matches the objects returned in hivePrivilegedObjects. This results in an 
> N*N iteration that causes bad performance (in my case, the HMS client timeout 
> expired and show tables failed). 
> This code needs to be optimized for performance; a sketch of one approach 
> follows below. 
> We have a similar problem in this code as well:
> ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/AuthorizationMetaStoreFilterHook.java
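
One common way to remove the N*N iteration (shown here only as a generic 
sketch over strings; the real code works on Hive table and privilege objects) 
is to build a hash set of the allowed names once and then make a single pass 
over tableList:

{code:java}
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Generic sketch of the idea: O(N) set build plus O(N) scan instead of an
// N*N nested loop over the two lists.
public final class AllowedNameFilter {

  public static List<String> filter(List<String> tableList, List<String> allowedNames) {
    Set<String> allowed = new HashSet<>(allowedNames);        // built once
    List<String> filtered = new ArrayList<>(tableList.size());
    for (String table : tableList) {                          // single pass
      if (allowed.contains(table)) {
        filtered.add(table);
      }
    }
    return filtered;
  }
}
{code}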



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-27646) Iceberg: Retry query when concurrent write queries fail due to conflicting writes

2023-08-24 Thread Simhadri Govindappa (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simhadri Govindappa updated HIVE-27646:
---
Summary: Iceberg: Retry query when concurrent write queries fail due to 
conflicting writes  (was: Iceberg: Retry query when concurrent write queries 
fail due to conflicting write)

> Iceberg: Retry query when concurrent write queries fail due to conflicting 
> writes
> -
>
> Key: HIVE-27646
> URL: https://issues.apache.org/jira/browse/HIVE-27646
> Project: Hive
>  Issue Type: Improvement
>Reporter: Simhadri Govindappa
>Assignee: Simhadri Govindappa
>Priority: Major
>
> Assume two concurrent update queries, Query A and Query B, that have 
> overlapping updates.
> If Query A commits its data and delete files first, then Query B will fail 
> with a validation failure due to conflicting writes. 
> In this case, Query B should invalidate the commit files that were already 
> generated and re-execute the full query on the latest snapshot (a rough 
> sketch follows below).
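
A rough sketch of the retry shape (the Attempt interface and its methods are 
assumed for illustration and are not Hive APIs; the only real type used is 
Iceberg's org.apache.iceberg.exceptions.ValidationException, assuming the 
Iceberg API is on the classpath):

{code:java}
import org.apache.iceberg.exceptions.ValidationException;

// Illustrative sketch only: re-execute a write after a conflicting concurrent commit.
public final class ConflictingWriteRetry {

  /** Hypothetical hook into the query machinery. */
  interface Attempt {
    void discardPendingCommitFiles();   // drop data/delete files staged by the failed attempt
    void executeOnLatestSnapshot();     // re-plan and re-run the query on a refreshed table
  }

  public static void runWithRetry(Attempt attempt, int maxRetries) {
    for (int retries = 0; ; retries++) {
      try {
        attempt.executeOnLatestSnapshot();
        return;                         // commit succeeded
      } catch (ValidationException conflict) {
        attempt.discardPendingCommitFiles();
        if (retries >= maxRetries) {
          throw conflict;               // give up after a bounded number of attempts
        }
      }
    }
  }
}
{code}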



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-27646) Iceberg: Retry query when concurrent write queries fail due to conflicting write

2023-08-24 Thread Simhadri Govindappa (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simhadri Govindappa updated HIVE-27646:
---
Description: 
Assume two concurrent update queries, Query A and Query B, that have 
overlapping updates.

If Query A commits its data and delete files first, then Query B will fail with 
a validation failure due to conflicting writes. 

In this case, Query B should invalidate the commit files that are already 
generated and re-execute the full query on the latest snapshot.

  was:
During concurrent updates,

Assume 2 concurrent update queries- Query A and Query B that have insersecting 
updates

If Query A commits the data and delet

If any conflicting files are detected during the commit stage of the query that 
commits last,  we will have to re-execute the full query. 


> Iceberg: Retry query when concurrent write queries fail due to conflicting 
> write
> 
>
> Key: HIVE-27646
> URL: https://issues.apache.org/jira/browse/HIVE-27646
> Project: Hive
>  Issue Type: Improvement
>Reporter: Simhadri Govindappa
>Assignee: Simhadri Govindappa
>Priority: Major
>
> Assume two concurrent update queries- Query A and Query B , that have 
> overlapping updates.
> If Query A commits the data and delete files first, then Query B will fail 
> with validation failure due to conflicting writes. 
> In this case, Query B should invalidate the commit files that are already 
> generated and re-execute the full query on the latest snapshot.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-27646) Iceberg: Retry query when concurrent write queries fail due to conflicting write

2023-08-24 Thread Simhadri Govindappa (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simhadri Govindappa updated HIVE-27646:
---
Description: 
During concurrent updates,

Assume 2 concurrent update queries- Query A and Query B that have intersecting 
updates

If Query A commits the data and delet

If any conflicting files are detected during the commit stage of the query that 
commits last,  we will have to re-execute the full query. 

  was:
During concurrent updates,

Assume 2 concurrent update quries- Query A

If any conflicting files are detected during the commit stage of the query that 
commits last,  we will have to re-execute the full query. 


> Iceberg: Retry query when concurrent write queries fail due to conflicting 
> write
> 
>
> Key: HIVE-27646
> URL: https://issues.apache.org/jira/browse/HIVE-27646
> Project: Hive
>  Issue Type: Improvement
>Reporter: Simhadri Govindappa
>Assignee: Simhadri Govindappa
>Priority: Major
>
> During concurrent updates,
> Assume 2 concurrent update queries- Query A and Query B that have 
> intersecting updates
> If Query A commits the data and delet
> If any conflicting files are detected during the commit stage of the query 
> that commits last,  we will have to re-execute the full query. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-27646) Iceberg: Retry query when concurrent write queries fail due to conflicting write

2023-08-24 Thread Simhadri Govindappa (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simhadri Govindappa updated HIVE-27646:
---
Description: 
During concurrent updates,

Assume 2 concurrent update queries- Query A

If any conflicting files are detected during the commit stage of the query that 
commits last,  we will have to re-execute the full query. 

  was:
During concurrent updates,

If any conflicting files are detected during the commit stage of the query that 
commits last, we will have to re-execute the full query. 


> Iceberg: Retry query when concurrent write queries fail due to conflicting 
> write
> 
>
> Key: HIVE-27646
> URL: https://issues.apache.org/jira/browse/HIVE-27646
> Project: Hive
>  Issue Type: Improvement
>Reporter: Simhadri Govindappa
>Assignee: Simhadri Govindappa
>Priority: Major
>
> During concurrent updates,
> Assume 2 concurrent update queries- Query A
> If any conflicting files are detected during the commit stage of the query 
> that commits last,  we will have to re-execute the full query. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-27646) Iceberg: Retry query when concurrent write queries fail due to conflicting write

2023-08-24 Thread Simhadri Govindappa (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simhadri Govindappa updated HIVE-27646:
---
Description: 
During concurrent updates,

If any conflicting files are detected during the commit stage of the query that 
commits last, we will have to re-execute the full query. 

> Iceberg: Retry query when concurrent write queries fail due to conflicting 
> write
> 
>
> Key: HIVE-27646
> URL: https://issues.apache.org/jira/browse/HIVE-27646
> Project: Hive
>  Issue Type: Improvement
>Reporter: Simhadri Govindappa
>Assignee: Simhadri Govindappa
>Priority: Major
>
> During concurrent updates,
> If any conflicting files are detected during the commit stage of the query 
> that commits last, we will have to re-execute the full query. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-27646) Iceberg: Retry query when concurrent write queries fail due to conflicting write

2023-08-24 Thread Simhadri Govindappa (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simhadri Govindappa updated HIVE-27646:
---
Summary: Iceberg: Retry query when concurrent write queries fail due to 
conflicting write  (was: Iceberg: Re-execute query when concurrent writes fail 
due to conflicting write)

> Iceberg: Retry query when concurrent write queries fail due to conflicting 
> write
> 
>
> Key: HIVE-27646
> URL: https://issues.apache.org/jira/browse/HIVE-27646
> Project: Hive
>  Issue Type: Improvement
>Reporter: Simhadri Govindappa
>Assignee: Simhadri Govindappa
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HIVE-27646) Iceberg: Re-execute query when concurrent writes fail due to conflicting write

2023-08-24 Thread Simhadri Govindappa (Jira)
Simhadri Govindappa created HIVE-27646:
--

 Summary: Iceberg: Re-execute query when concurrent writes fail due 
to conflicting write
 Key: HIVE-27646
 URL: https://issues.apache.org/jira/browse/HIVE-27646
 Project: Hive
  Issue Type: Improvement
Reporter: Simhadri Govindappa
Assignee: Simhadri Govindappa






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (HIVE-27640) Counter for query concurrency

2023-08-24 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-27640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor reassigned HIVE-27640:
---

Assignee: László Bodor

> Counter for query concurrency
> -
>
> Key: HIVE-27640
> URL: https://issues.apache.org/jira/browse/HIVE-27640
> Project: Hive
>  Issue Type: Sub-task
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
>
> This is kind of hard to catch easily, but I would like to see 
> something/anything about query concurrency in the query counters. This way we 
> can instantly see in the query summary what happened. I mean counters like:
> 1. how many queries were running when this query arrived
> 2. same as 1) but in query stage level
> 2a) how many queries were being compiled (or waiting for compilation) when 
> this query started to compile (or was enqueued for compilation)
> 2b) how many queries were waiting for a coordinator when this query started 
> to get a coordinator
> 2c) how many queries were in the Run DAG phase, when this query started to 
> run DAG



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HIVE-27606) Backport of HIVE-21171: Skip creating scratch dirs for tez if RPC is on

2023-08-24 Thread Sankar Hariappan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan resolved HIVE-27606.
-
Fix Version/s: 3.2.0
   Resolution: Fixed

> Backport of HIVE-21171: Skip creating scratch dirs for tez if RPC is on
> ---
>
> Key: HIVE-27606
> URL: https://issues.apache.org/jira/browse/HIVE-27606
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Aman Raj
>Assignee: Aman Raj
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.2.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HIVE-27607) Backport of HIVE-21182 Skip setting up hive scratch dir during planning

2023-08-24 Thread Sankar Hariappan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan resolved HIVE-27607.
-
Fix Version/s: 3.2.0
   Resolution: Fixed

> Backport of HIVE-21182 Skip setting up hive scratch dir during planning
> ---
>
> Key: HIVE-27607
> URL: https://issues.apache.org/jira/browse/HIVE-27607
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Aman Raj
>Assignee: Aman Raj
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.2.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-27630) Iceberg: Fast forward branch

2023-08-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-27630:
--
Labels: pull-request-available  (was: )

> Iceberg: Fast forward branch
> 
>
> Key: HIVE-27630
> URL: https://issues.apache.org/jira/browse/HIVE-27630
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Denys Kuzmenko
>Assignee: Ayush Saxena
>Priority: Major
>  Labels: pull-request-available
>
> Add support to fastForward main branch to the head of feature-branch to 
> update the main table state.
> {code}
> table.manageSnapshots().fastForward("main", "feature-branch").commit()
> {code}
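For context, a hedged sketch of the surrounding snapshot-management calls using the public 
Iceberg API (not the Hive DDL added by this ticket); the exact method names, such as 
fastForwardBranch versus the fastForward shorthand quoted above, depend on the Iceberg 
version and should be treated as assumptions.

{code}
// Sketch assuming a recent Iceberg release; "table" is an already-loaded table handle.
import org.apache.iceberg.Table;

public class FastForwardSketch {

  static void fastForwardMain(Table table) {
    // Branch off the current table state.
    table.manageSnapshots()
        .createBranch("feature-branch", table.currentSnapshot().snapshotId())
        .commit();

    // ... writes land on "feature-branch" while "main" stays untouched ...

    // Fast-forward main to the branch head; this only succeeds while the current main
    // snapshot is still an ancestor of the feature-branch head.
    table.manageSnapshots()
        .fastForwardBranch("main", "feature-branch")
        .commit();
  }
}
{code}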



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-27643) Exclude compaction queries from ranger policies

2023-08-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-27643:
--
Labels: pull-request-available  (was: )

> Exclude compaction queries from ranger policies
> ---
>
> Key: HIVE-27643
> URL: https://issues.apache.org/jira/browse/HIVE-27643
> Project: Hive
>  Issue Type: Bug
>Reporter: László Végh
>Assignee: László Végh
>Priority: Critical
>  Labels: pull-request-available
>
> Applying masking or filtering Ranger policies to the compaction users causes 
> data loss, as the policies will also be applied to the compaction queries.
> While this is a kind of misconfiguration, the result is so bad, that the 
> users should be protected from it by automatically excluding compaction 
> queries from ALL ranger policies.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HIVE-27638) Preparing for 4.0.0-beta-2 development

2023-08-24 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis resolved HIVE-27638.

Fix Version/s: 4.0.0
   Resolution: Fixed

Fixed in 
https://github.com/apache/hive/commit/1ebef40aba00c4ec5376d8f0623196b42425a589. 
Thanks for the reviews [~ayushsaxena], [~aturoczy].

> Preparing for 4.0.0-beta-2 development
> --
>
> Key: HIVE-27638
> URL: https://issues.apache.org/jira/browse/HIVE-27638
> Project: Hive
>  Issue Type: Task
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> The main goal of this ticket is to increment the version and add the 
> necessary metastore upgrade scripts so we don't lose track of what changed 
> after the beta-1 release.
> If later we decide to use another name (other than beta-2) that would be 
> completely fine (and hopefully a simple rename would do). The most important 
> thing in this change is to have the scripts in place so we don't mess up when 
> we push changes to the metastore schema.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-27639) Query level performance counters

2023-08-24 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-27639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor updated HIVE-27639:

Summary: Query level performance counters  (was: Performance counters for 
easier investigations)

> Query level performance counters
> 
>
> Key: HIVE-27639
> URL: https://issues.apache.org/jira/browse/HIVE-27639
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Bodor
>Priority: Major
>
> We need to move performance measurements to the next level and implement 
> whatever is needed to make it easier to answer problems like 
> “query is slow”. The problem is that we keep digging into logs and watching 
> metrics that are provided by *something* in the environment (which can be 
> anything that the actual vendor implements outside of Hive).
> Let's try to localize environment problems to the interval of the slow 
> query and expose them through counters.
> Also, let's keep in mind that performance measurements ideally should never 
> cause performance problems themselves: heavyweight measurements should be 
> disabled by default.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-27630) Iceberg: Fast forward branch

2023-08-24 Thread Ayush Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena updated HIVE-27630:

Summary: Iceberg: Fast forward branch  (was: Iceberg: Fast forward/rebase 
branch)

> Iceberg: Fast forward branch
> 
>
> Key: HIVE-27630
> URL: https://issues.apache.org/jira/browse/HIVE-27630
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Denys Kuzmenko
>Assignee: Ayush Saxena
>Priority: Major
>
> Add support to fastForward main branch to the head of feature-branch to 
> update the main table state.
> {code}
> table.manageSnapshots().fastForward("main", "feature-branch").commit()
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-27645) Clean test cases by refactoring assertFalse(equals()) using assertNotEquals & @Test(excepted) using assertThrows

2023-08-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-27645:
--
Labels: pull-request-available  (was: )

> Clean test cases by refactoring assertFalse(equals()) using assertNotEquals & 
> @Test(excepted) using assertThrows
> 
>
> Key: HIVE-27645
> URL: https://issues.apache.org/jira/browse/HIVE-27645
> Project: Hive
>  Issue Type: Improvement
>Reporter: Taher Ghaleb
>Priority: Minor
>  Labels: pull-request-available
>
> I am working on research that investigates test smell refactoring in which we 
> identify alternative implementations of test cases, study how commonly used 
> these refactorings are, and assess how acceptable they are in practice.
> The first smell is when inappropriate assertions are used, while there exist 
> better alternatives. For example, assertNotEquals(x, y) is more 
> appropriate to use instead of assertFalse(x.equals(y)).
> The second smell is when exception handling can alternatively be implemented 
> using assertion rather than annotation. For example, 
> assertThrows(Exception.class, () -> {...}) is 
> more appropriate to use instead of @Test(expected = 
> Exception.class).
> While there could be several cases like this, we aim in this pull request to 
> get your feedback on these particular test smells and their refactorings. 
> Thanks in advance for your input.
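To make the two refactorings concrete, a small self-contained JUnit 4 sketch (assertThrows 
needs JUnit 4.13+; the string values and the NumberFormatException case are made up for 
illustration):

{code}
import static org.junit.Assert.assertFalse;
import static org.junit.Assert.assertNotEquals;
import static org.junit.Assert.assertThrows;

import org.junit.Test;

public class TestSmellRefactoringExamples {

  // Smell 1: inequality asserted through assertFalse(equals()); a failure only reports "false".
  @Test
  public void inequalityBefore() {
    assertFalse("id-1".equals("id-2"));
  }

  // Refactored: assertNotEquals reports both values when the assertion fails.
  @Test
  public void inequalityAfter() {
    assertNotEquals("id-1", "id-2");
  }

  // Smell 2: the expected exception is declared on the annotation, so the test passes no
  // matter which statement inside the method throws it.
  @Test(expected = NumberFormatException.class)
  public void exceptionBefore() {
    Integer.parseInt("not-a-number");
  }

  // Refactored: assertThrows pins the expected exception to a single statement.
  @Test
  public void exceptionAfter() {
    assertThrows(NumberFormatException.class, () -> Integer.parseInt("not-a-number"));
  }
}
{code}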



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HIVE-27645) Clean test cases by refactoring assertFalse(equals()) using assertNotEquals & @Test(excepted) using assertThrows

2023-08-24 Thread Taher Ghaleb (Jira)
Taher Ghaleb created HIVE-27645:
---

 Summary: Clean test cases by refactoring assertFalse(equals()) 
using assertNotEquals & @Test(excepted) using assertThrows
 Key: HIVE-27645
 URL: https://issues.apache.org/jira/browse/HIVE-27645
 Project: Hive
  Issue Type: Improvement
Reporter: Taher Ghaleb


I am working on research that investigates test smell refactoring in which we 
identify alternative implementations of test cases, study how commonly used 
these refactorings are, and assess how acceptable they are in practice.

The first smell is when inappropriate assertions are used, while there exist 
better alternatives. For example, assertNotEquals(x, y) is more 
appropriate to use instead of assertFalse(x.equals(y)).

The second smell is when exception handling can alternatively be implemented 
using assertion rather than annotation. For example, 
assertThrows(Exception.class, () -> {...}) is more 
appropriate to use instead of @Test(expected = Exception.class).

While there could be several cases like this, we aim in this pull request to 
get your feedback on these particular test smells and their refactorings. 
Thanks in advance for your input.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-27645) Clean test cases by refactoring assertFalse(equals()) using assertNotEquals & @Test(excepted) using assertThrows

2023-08-24 Thread Taher Ghaleb (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Taher Ghaleb updated HIVE-27645:

Description: 
I am working on research that investigates test smell refactoring in which we 
identify alternative implementations of test cases, study how commonly used 
these refactorings are, and assess how acceptable they are in practice.

The first smell is when inappropriate assertions are used, while there exist 
better alternatives. For example, assertNotEquals(x, y) is more 
appropriate to use instead of assertFalse(x.equals(y)).

The second smell is when exception handling can alternatively be implemented 
using assertion rather than annotation. For example, 
assertThrows(Exception.class, () -> {...}) is more 
appropriate to use instead of @Test(expected = Exception.class).

While there could be several cases like this, we aim in this pull request to 
get your feedback on these particular test smells and their refactorings. 
Thanks in advance for your input.

  was:
I am working on research that investigates test smell refactoring in which we 
identify alternative implementations of test cases, study how commonly used 
these refactorings are, and assess how acceptable they are in practice.

The first smell is when inappropriate assertions are used, while there exist 
better alternatives. For example, assertNotEquals(x, y) is more 
appropriate to use instead of assertFalse(x.equals(y)).

The second smell is when exception handling can alternatively be implemented 
using assertion rather than annotation. For example, 
assertThrows(Exception.class, () -> {...}) is more 
appropriate to use instead of @Test(expected = Exception.class).

While there could be several cases like this, we aim in this pull request to 
get your feedback on these particular test smells and their refactorings. 
Thanks in advance for your input.


> Clean test cases by refactoring assertFalse(equals()) using assertNotEquals & 
> @Test(excepted) using assertThrows
> 
>
> Key: HIVE-27645
> URL: https://issues.apache.org/jira/browse/HIVE-27645
> Project: Hive
>  Issue Type: Improvement
>Reporter: Taher Ghaleb
>Priority: Minor
>
> I am working on research that investigates test smell refactoring in which we 
> identify alternative implementations of test cases, study how commonly used 
> these refactorings are, and assess how acceptable they are in practice.
> The first smell is when inappropriate assertions are used, while there exist 
> better alternatives. For example, assertNotEquals(x, y) is more 
> appropriate to use instead of assertFalse(x.equals(y)).
> The second smell is when exception handling can alternatively be implemented 
> using assertion rather than annotation. For example, 
> assertThrows(Exception.class, () -> {...}) is 
> more appropriate to use instead of @Test(expected = 
> Exception.class).
> While there could be several cases like this, we aim in this pull request to 
> get your feedback on these particular test smells and their refactorings. 
> Thanks in advance for your input.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)