[jira] [Commented] (HIVE-27650) Oracle init-db is flaky
[ https://issues.apache.org/jira/browse/HIVE-27650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17758748#comment-17758748 ]

Ayush Saxena commented on HIVE-27650:
-

Similar to commit 34b24d55ade393673424f077b69add43bad9f731, which disabled MySQL, I plan to disable the Oracle docker for now until we figure out the reason for the failure. It fails pretty often, leads to retriggers, and ultimately wastes resources.

> Oracle init-db is flaky
> ---
>
> Key: HIVE-27650
> URL: https://issues.apache.org/jira/browse/HIVE-27650
> Project: Hive
> Issue Type: Bug
> Reporter: Ayush Saxena
> Priority: Major
>
> The oracle docker in hive precommit fails very often, e.g.
> http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/PR-4578/2/pipeline/462

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
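Disabling the step avoids the retriggers, but a common alternative for a flaky init step is to retry it with backoff before failing the whole pipeline. A minimal sketch, assuming a generic setup command; this is not Hive's actual precommit code, and the function name, command, and limits are invented for illustration:

```python
import subprocess
import time

def init_db_with_retry(cmd, attempts=3, base_delay=2.0):
    """Run a flaky setup command, retrying with exponential backoff.

    Returns True on the first success, False if every attempt failed.
    """
    for i in range(attempts):
        if subprocess.run(cmd, shell=True).returncode == 0:
            return True
        time.sleep(base_delay * (2 ** i))  # back off before the next try
    return False

# Hypothetical usage -- the docker command below is illustrative only:
# init_db_with_retry("docker run --rm my-oracle-init", attempts=3)
```

The trade-off is the one named in the comment: retries hide the flakiness at the cost of CI time, whereas disabling the container removes the cost until the root cause is found.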
[jira] [Created] (HIVE-27650) Oracle init-db is flaky
Ayush Saxena created HIVE-27650:
---

 Summary: Oracle init-db is flaky
 Key: HIVE-27650
 URL: https://issues.apache.org/jira/browse/HIVE-27650
 Project: Hive
 Issue Type: Bug
 Reporter: Ayush Saxena

The oracle docker in hive precommit fails very often, e.g.
http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/PR-4578/2/pipeline/462
[jira] [Updated] (HIVE-27646) Iceberg: Retry query when concurrent write queries fail due to conflicting writes
[ https://issues.apache.org/jira/browse/HIVE-27646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated HIVE-27646:
--
 Labels: pull-request-available (was: )

> Iceberg: Retry query when concurrent write queries fail due to conflicting writes
> -
>
> Key: HIVE-27646
> URL: https://issues.apache.org/jira/browse/HIVE-27646
> Project: Hive
> Issue Type: Improvement
> Reporter: Simhadri Govindappa
> Assignee: Simhadri Govindappa
> Priority: Major
> Labels: pull-request-available
>
> Assume two concurrent update queries, Query A and Query B, that have overlapping updates.
> If Query A commits its data and delete files first, then Query B will fail validation due to the conflicting writes.
> In this case, Query B should invalidate the commit files that are already generated and re-execute the full query on the latest snapshot.
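The behavior described for Query B is classic optimistic concurrency control: prepare files against a snapshot, validate at commit time, and on conflict discard the prepared files and re-execute against the latest snapshot. A toy sketch of that loop; the dict-based table and its fields are invented for illustration and are not Iceberg's or Hive's actual API:

```python
class ConflictError(Exception):
    """Raised when every retry lost the commit race."""

def run_with_retry(table, update_fn, max_retries=3):
    """Optimistically apply update_fn, re-executing on conflicting writes.

    `table` is a dict with a monotonically increasing "snapshot" id.
    `update_fn(snapshot)` stands in for producing data/delete files.
    """
    for _ in range(max_retries):
        snapshot = table["snapshot"]          # read the latest snapshot id
        pending = update_fn(snapshot)         # generate the commit files
        if table["snapshot"] == snapshot:     # validation: no concurrent commit landed
            table["snapshot"] = snapshot + 1  # commit succeeds atomically
            table["data"] = pending
            return pending
        # A conflicting write won the race: drop `pending` and retry
        # against the new snapshot, i.e. re-execute the full query.
    raise ConflictError("gave up after %d retries" % max_retries)
```

If a concurrent committer bumps the snapshot between read and validation, the loop simply re-runs `update_fn` on the new snapshot, which is the invalidate-and-re-execute behavior the issue asks for.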
[jira] [Updated] (HIVE-27532) Missing semicolon in show create table and show create database output
[ https://issues.apache.org/jira/browse/HIVE-27532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Soumyakanti Das updated HIVE-27532:
---

Description:
When we run SHOW CREATE TABLE on a table with constraints, it doesn't add a semicolon to the CREATE TABLE DDL. Here's the output for the tpcds table reason:

{code:java}
CREATE TABLE `reason`(
  `r_reason_sk` int,
  `r_reason_id` string,
  `r_reason_desc` string)
ROW FORMAT SERDE
  'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
WITH SERDEPROPERTIES (
  'field.delim'='|',
  'serialization.format'='|')
STORED AS ORC
TBLPROPERTIES (
  'transactional'='true',
  'OBJCAPABILITIES'='EXTREAD,EXTWRITE',
  'STATS_GENERATED'='TASK',
  'impala.lastComputeStatsTime'='1674074181',
  'serialization.null.format'='',
  'transient_lastDdlTime'='1674073496')
ALTER TABLE reason ADD CONSTRAINT 2e47abb2-b6c7-450a-8229-395d6b1ff168 PRIMARY KEY (r_reason_sk) DISABLE NOVALIDATE RELY;
{code}

For completeness, we should also add a semicolon to the SHOW CREATE DATABASE output.

was:
When we run SHOW CREATE TABLE on a table with constraints, it doesn't add a semicolon to the CREATE TABLE DDL. Here's the output for the tpcds table reason:

{code:java}
CREATE TABLE `reason`(
  `r_reason_sk` int,
  `r_reason_id` string,
  `r_reason_desc` string)
ROW FORMAT SERDE
  'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
WITH SERDEPROPERTIES (
  'field.delim'='|',
  'serialization.format'='|')
STORED AS ORC
TBLPROPERTIES (
  'transactional'='true',
  'OBJCAPABILITIES'='EXTREAD,EXTWRITE',
  'STATS_GENERATED'='TASK',
  'impala.lastComputeStatsTime'='1674074181',
  'serialization.null.format'='',
  'transient_lastDdlTime'='1674073496')
ALTER TABLE reason ADD CONSTRAINT 2e47abb2-b6c7-450a-8229-395d6b1ff168 PRIMARY KEY (r_reason_sk) DISABLE NOVALIDATE RELY;
{code}

> Missing semicolon in show create table and show create database output
> --
>
> Key: HIVE-27532
> URL: https://issues.apache.org/jira/browse/HIVE-27532
> Project: Hive
> Issue Type: Bug
> Components: Hive
> Reporter: Soumyakanti Das
> Assignee: Soumyakanti Das
> Priority: Major
> Labels: pull-request-available
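The fix amounts to terminating each generated statement so the output can be replayed as a script. A minimal illustration of the idea; `terminate_statements` is a hypothetical helper, not Hive's actual DDL-generation code:

```python
def terminate_statements(statements):
    """Join generated DDL statements, appending a semicolon to any
    statement that lacks one, so SHOW CREATE TABLE output is replayable."""
    return "\n".join(
        s if s.rstrip().endswith(";") else s.rstrip() + ";"
        for s in statements
    )

# The first statement mimics the buggy output above (no trailing semicolon),
# the second already carries one, as in the reported example.
ddl = terminate_statements([
    "CREATE TABLE `reason`(`r_reason_sk` int)",
    "ALTER TABLE reason ADD CONSTRAINT pk PRIMARY KEY (r_reason_sk) DISABLE NOVALIDATE RELY;",
])
```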
[jira] [Updated] (HIVE-27532) Missing semicolon in show create table/database output
[ https://issues.apache.org/jira/browse/HIVE-27532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Soumyakanti Das updated HIVE-27532:
---
 Summary: Missing semicolon in show create table/database output
 (was: Missing semicolon in show create table output)

> Missing semicolon in show create table/database output
> --
>
> Key: HIVE-27532
> URL: https://issues.apache.org/jira/browse/HIVE-27532
> Project: Hive
> Issue Type: Bug
> Components: Hive
> Reporter: Soumyakanti Das
> Assignee: Soumyakanti Das
> Priority: Major
> Labels: pull-request-available
[jira] [Updated] (HIVE-27532) Missing semicolon in show create table and show create database output
[ https://issues.apache.org/jira/browse/HIVE-27532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Soumyakanti Das updated HIVE-27532:
---
 Summary: Missing semicolon in show create table and show create database output
 (was: Missing semicolon in show create table/database output)

> Missing semicolon in show create table and show create database output
> --
>
> Key: HIVE-27532
> URL: https://issues.apache.org/jira/browse/HIVE-27532
> Project: Hive
> Issue Type: Bug
> Components: Hive
> Reporter: Soumyakanti Das
> Assignee: Soumyakanti Das
> Priority: Major
> Labels: pull-request-available
[jira] [Comment Edited] (HIVE-27649) Subqueries with a set operator do not support order by clauses
[ https://issues.apache.org/jira/browse/HIVE-27649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17758701#comment-17758701 ]

Nicolas Richard edited comment on HIVE-27649 at 8/24/23 7:08 PM:
-

I did some investigation to figure out what was going on.

In https://issues.apache.org/jira/browse/HIVE-21980, the grammar [changed a bit|https://github.com/apache/hive/commit/0f39030c3d33b11ae9c14ac81e047b44e8695371] from:

{code:java}
atomjoinSource
@init { gParent.pushMsg("joinSource", state); }
@after { gParent.popMsg(state); }
    :
    tableSource (lateralView^)*
    | virtualTableSource (lateralView^)*
    | (subQuerySource) => subQuerySource (lateralView^)*
    | partitionedTableFunction (lateralView^)*
    | LPAREN! joinSource RPAREN!
    ;
{code}

to

{code:java}
atomjoinSource
@init { gParent.pushMsg("joinSource", state); }
@after { gParent.popMsg(state); }
    :
    tableSource (lateralView^)*
    | virtualTableSource (lateralView^)*
    | (LPAREN (KW_WITH|KW_SELECT|KW_MAP|KW_REDUCE|KW_FROM)) => subQuerySource (lateralView^)*
    | (LPAREN LPAREN atomSelectStatement RPAREN setOperator) => subQuerySource (lateralView^)*
    | partitionedTableFunction (lateralView^)*
    | LPAREN! joinSource RPAREN!
    ;
{code}

When the query is parsed, we end up in the subQuerySource rule because atomSelectStatement, by definition, cannot contain SORT BY, CLUSTER BY, DISTRIBUTE BY or LIMIT clauses. An exception is thrown because subQuerySource requires an identifier, which is neither present nor needed in this particular scenario.

I tested it locally, and changing _atomSelectStatement_ to _selectStatement_ solves the issue. However, I still need to validate that it does not have side effects by running the whole test suite.

> Subqueries with a set operator do not support order by clauses
> --
>
> Key: HIVE-27649
> URL: https://issues.apache.org/jira/browse/HIVE-27649
> Project: Hive
> Issue Type: Bug
> Components: Parser
> Affects Versions: 3.1.2, 4.0.0
> Reporter: Nicolas Richard
> Priority: Major
> Labels: pull-request-available
>
> Consider the following query:
> {code:java}
> select key from ((select key from src order by key) union (select key from src))subq
> {code}
> Up until 3.1.2, Hive would parse this query without any problems. However, if you try it on the latest versions, you'll get the following exception:
> {code:java}
> org.apache.hadoop.hive.ql.parse.ParseException: line 1:60 cannot recognize input near 'union' '(' 'select' in subquery source
>     at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:125)
>     at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:97)
> {code}
> With the inner exception stack trace being:
> {code:java}
> NoViableAltException(367@[])
>     at org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.identifier(HiveParser_IdentifiersParser.java:14006)
>     at org.apache.hadoop.hive.ql.parse.HiveParser.identifier(HiveParser.java:45086)
>     at org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.subQuerySource(HiveParser_FromClauseParser.java:5411)
> {code}
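The failure mode in the comment above can be reproduced in miniature: a syntactic predicate that only recognizes an atom SELECT (no trailing clauses) inside the parentheses will not fire when the operand carries an ORDER BY, so the parser falls through to a path that demands a subquery alias. A toy sketch of the two predicates; this is plain Python, not ANTLR, and the function names are invented purely to illustrate the distinction:

```python
# Clauses that atomSelectStatement, by definition, cannot contain.
TERMINAL_CLAUSES = ("order by", "sort by", "cluster by", "distribute by", "limit")

def is_atom_select(text):
    """Toy stand-in for atomSelectStatement: a SELECT with none of the
    trailing clauses allowed only at the top of a selectStatement."""
    body = text.lower()
    return body.startswith("select") and not any(c in body for c in TERMINAL_CLAUSES)

def is_select_statement(text):
    """Toy stand-in for selectStatement: trailing clauses are allowed."""
    return text.lower().startswith("select")

def predicate_fires(first_operand, rule):
    """Mimic the '(LPAREN LPAREN atomSelectStatement RPAREN setOperator)'
    lookahead: commit to subQuerySource only if the first parenthesized
    operand of the set operation matches `rule`."""
    return rule(first_operand)

# First operand of: ((select key from src order by key) union (select key from src))subq
first_operand = "select key from src order by key"
assert not predicate_fires(first_operand, is_atom_select)   # old grammar: predicate misses
assert predicate_fires(first_operand, is_select_statement)  # widened rule: predicate fires
```

When the narrow predicate misses, a real parser would try the remaining alternatives, which is how Hive ends up in subQuerySource expecting an identifier that isn't there.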
[jira] [Commented] (HIVE-27649) Subqueries with a set operator do not support order by clauses
[ https://issues.apache.org/jira/browse/HIVE-27649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17758705#comment-17758705 ]

Nicolas Richard commented on HIVE-27649:


Proposed fix: [https://github.com/apache/hive/pull/4628] (still in Draft)

> Subqueries with a set operator do not support order by clauses
> --
>
> Key: HIVE-27649
> URL: https://issues.apache.org/jira/browse/HIVE-27649
> Project: Hive
> Issue Type: Bug
> Components: Parser
> Affects Versions: 3.1.2, 4.0.0
> Reporter: Nicolas Richard
> Priority: Major
> Labels: pull-request-available
>
> Consider the following query:
> {code:java}
> select key from ((select key from src order by key) union (select key from src))subq
> {code}
> Up until 3.1.2, Hive would parse this query without any problems. However, if you try it on the latest versions, you'll get the following exception:
> {code:java}
> org.apache.hadoop.hive.ql.parse.ParseException: line 1:60 cannot recognize input near 'union' '(' 'select' in subquery source
>     at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:125)
>     at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:97)
> {code}
> With the inner exception stack trace being:
> {code:java}
> NoViableAltException(367@[])
>     at org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.identifier(HiveParser_IdentifiersParser.java:14006)
>     at org.apache.hadoop.hive.ql.parse.HiveParser.identifier(HiveParser.java:45086)
>     at org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.subQuerySource(HiveParser_FromClauseParser.java:5411)
>     at org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.atomjoinSource(HiveParser_FromClauseParser.java:1921)
>     at org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.joinSource(HiveParser_FromClauseParser.java:2175)
>     at org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.atomjoinSource(HiveParser_FromClauseParser.java:2110)
>     at org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.joinSource(HiveParser_FromClauseParser.java:2175)
>     at org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.fromSource(HiveParser_FromClauseParser.java:1750)
>     at org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.fromClause(HiveParser_FromClauseParser.java:1593)
>     at org.apache.hadoop.hive.ql.parse.HiveParser.fromClause(HiveParser.java:45094)
>     at org.apache.hadoop.hive.ql.parse.HiveParser.atomSelectStatement(HiveParser.java:38538)
>     at org.apache.hadoop.hive.ql.parse.HiveParser.selectStatement(HiveParser.java:38831)
>     at org.apache.hadoop.hive.ql.parse.HiveParser.regularBody(HiveParser.java:38424)
>     at org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpressionBody(HiveParser.java:37686)
>     at org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpression(HiveParser.java:37574)
>     at org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:2757)
>     at org.apache.hadoop.hive.ql.parse.HiveParser.explainStatement(HiveParser.java:1751)
>     at org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1614)
>     at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:123)
>     at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:97)
> {code}
> Note that this behavior also happens if the subquery contains a SORT BY, CLUSTER BY, DISTRIBUTE BY or LIMIT clause.
[jira] [Updated] (HIVE-27649) Subqueries with a set operator do not support order by clauses
[ https://issues.apache.org/jira/browse/HIVE-27649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated HIVE-27649:
--
 Labels: pull-request-available (was: )

> Subqueries with a set operator do not support order by clauses
> --
>
> Key: HIVE-27649
> URL: https://issues.apache.org/jira/browse/HIVE-27649
> Project: Hive
> Issue Type: Bug
> Components: Parser
> Affects Versions: 3.1.2, 4.0.0
> Reporter: Nicolas Richard
> Priority: Major
> Labels: pull-request-available
[jira] [Commented] (HIVE-27649) Subqueries with a set operator do not support order by clauses
[ https://issues.apache.org/jira/browse/HIVE-27649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17758701#comment-17758701 ]

Nicolas Richard commented on HIVE-27649:


I did some investigation to figure out what was going on.

In https://issues.apache.org/jira/browse/HIVE-21980, the grammar [changed a bit|https://github.com/apache/hive/commit/0f39030c3d33b11ae9c14ac81e047b44e8695371] from:

{code:java}
atomjoinSource
@init { gParent.pushMsg("joinSource", state); }
@after { gParent.popMsg(state); }
    :
    tableSource (lateralView^)*
    | virtualTableSource (lateralView^)*
    | (subQuerySource) => subQuerySource (lateralView^)*
    | partitionedTableFunction (lateralView^)*
    | LPAREN! joinSource RPAREN!
    ;
{code}

to

{code:java}
atomjoinSource
@init { gParent.pushMsg("joinSource", state); }
@after { gParent.popMsg(state); }
    :
    tableSource (lateralView^)*
    | virtualTableSource (lateralView^)*
    | (LPAREN (KW_WITH|KW_SELECT|KW_MAP|KW_REDUCE|KW_FROM)) => subQuerySource (lateralView^)*
    | (LPAREN LPAREN atomSelectStatement RPAREN setOperator) => subQuerySource (lateralView^)*
    | partitionedTableFunction (lateralView^)*
    | LPAREN! joinSource RPAREN!
    ;
{code}

When the query is parsed, we end up in the subQuerySource rule because atomSelectStatement, by definition, cannot contain SORT BY, CLUSTER BY, DISTRIBUTE BY or LIMIT clauses. An exception is thrown because subQuerySource requires an identifier, which is neither present nor needed in this particular scenario.

I tested it locally, and changing _atomSelectStatement_ to _selectStatement_ solves the issue. However, I still need to validate that it does not have side effects by running the whole test suite.

> Subqueries with a set operator do not support order by clauses
> --
>
> Key: HIVE-27649
> URL: https://issues.apache.org/jira/browse/HIVE-27649
> Project: Hive
> Issue Type: Bug
> Components: Parser
> Affects Versions: 3.1.2, 4.0.0
> Reporter: Nicolas Richard
> Priority: Major
>
> Consider the following query:
> {code:java}
> select key from ((select key from src order by key) union (select key from src))subq
> {code}
> Up until 3.1.2, Hive would parse this query without any problems. However, if you try it on the latest versions, you'll get the following exception:
> {code:java}
> org.apache.hadoop.hive.ql.parse.ParseException: line 1:60 cannot recognize input near 'union' '(' 'select' in subquery source
>     at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:125)
>     at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:97)
> {code}
> With the inner exception stack trace being:
> {code:java}
> NoViableAltException(367@[])
>     at org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.identifier(HiveParser_IdentifiersParser.java:14006)
>     at org.apache.hadoop.hive.ql.parse.HiveParser.identifier(HiveParser.java:45086)
>     at org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.subQuerySource(HiveParser_FromClauseParser.java:5411)
>     at org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.atomjoinSource(HiveParser_FromClauseParser.java:1921)
>     at org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.joinSource(HiveParser_FromClauseParser.java:2175)
>     at org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.atomjoinSource(HiveParser_FromClauseParser.java:2110)
>     at org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.joinSource(HiveParser_FromClauseParser.java:2175)
>     at org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.fromSource(HiveParser_FromClauseParser.java:1750)
>     at org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.fromClause(HiveParser_FromClauseParser.java:1593)
>     at org.apache.hadoop.hive.ql.parse.HiveParser.fromClause(HiveParser.java:45094)
>     at org.apache.hadoop.hive.ql.parse.HiveParser.atomSelectStatement(HiveParser.java:38538)
>     at org.apache.hadoop.hive.ql.parse.HiveParser.selectStatement(HiveParser.java:38831)
>     at org.apache.hadoop.hive.ql.parse.HiveParser.regularBody(HiveParser.java:38424)
>     at org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpressionBody(HiveParser.java:37686)
>     at org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpression(HiveParser.java:37574)
>     at org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:2757)
>     at org.apache.hadoop.hive.ql.parse.HiveParser.explainStatement(HiveParser.java:1751)
>     at org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1614)
>     at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:123)
> {code}
[jira] [Updated] (HIVE-27649) Subqueries with a set operator do not support order by clauses
[ https://issues.apache.org/jira/browse/HIVE-27649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicolas Richard updated HIVE-27649: --- Description: Consider the following query: {code:java} select key from ((select key from src order by key) union (select key from src))subq {code} Up until 3.1.2, Hive would parse this query without any problems. However, if you try it on the latest versions, you'll get the following exception: {code:java} org.apache.hadoop.hive.ql.parse.ParseException: line 1:60 cannot recognize input near 'union' '(' 'select' in subquery source at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:125) at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:97) {code} With the inner exception stack trace being: {code:java} NoViableAltException(367@[]) at org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.identifier(HiveParser_IdentifiersParser.java:14006) at org.apache.hadoop.hive.ql.parse.HiveParser.identifier(HiveParser.java:45086) at org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.subQuerySource(HiveParser_FromClauseParser.java:5411) at org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.atomjoinSource(HiveParser_FromClauseParser.java:1921) at org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.joinSource(HiveParser_FromClauseParser.java:2175) at org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.atomjoinSource(HiveParser_FromClauseParser.java:2110) at org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.joinSource(HiveParser_FromClauseParser.java:2175) at org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.fromSource(HiveParser_FromClauseParser.java:1750) at org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.fromClause(HiveParser_FromClauseParser.java:1593) at org.apache.hadoop.hive.ql.parse.HiveParser.fromClause(HiveParser.java:45094) at 
org.apache.hadoop.hive.ql.parse.HiveParser.atomSelectStatement(HiveParser.java:38538) at org.apache.hadoop.hive.ql.parse.HiveParser.selectStatement(HiveParser.java:38831) at org.apache.hadoop.hive.ql.parse.HiveParser.regularBody(HiveParser.java:38424) at org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpressionBody(HiveParser.java:37686) at org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpression(HiveParser.java:37574) at org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:2757) at org.apache.hadoop.hive.ql.parse.HiveParser.explainStatement(HiveParser.java:1751) at org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1614) at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:123) at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:97) {code} Note that this behavior also happens if the subquery contains a SORT BY, CLUSTER BY, DISTRIBUTE BY or LIMIT clause. was: Consider the following query: {code:java} select key from ((select key from src order by key) union (select key from src))subq {code} Up until 3.1.2, Hive would parse this query without any problems. 
However, if you try it on the latest versions, you'll get the following exception: {code:java} org.apache.hadoop.hive.ql.parse.ParseException: line 1:60 cannot recognize input near 'union' '(' 'select' in subquery source at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:125) at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:97) {code} With the inner exception stack trace being: {code:java} NoViableAltException(367@[]) at org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.identifier(HiveParser_IdentifiersParser.java:14006) at org.apache.hadoop.hive.ql.parse.HiveParser.identifier(HiveParser.java:45086) at org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.subQuerySource(HiveParser_FromClauseParser.java:5411) at org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.atomjoinSource(HiveParser_FromClauseParser.java:1921) at org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.joinSource(HiveParser_FromClauseParser.java:2175) at org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.atomjoinSource(HiveParser_FromClauseParser.java:2110) at org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.joinSource(HiveParser_FromClauseParser.java:2175) at org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.fromSource(HiveParser_FromClauseParser.java:1750) at org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.fromClause(HiveParser_FromClauseParser.java:1593) at org.apache.hadoop.hive.ql.parse.HiveParser.fromClause(HiveParser.java:45094) at org.apache.hadoop.hive.ql.parse.HiveParser.atomSelectStatement(HiveParser.java:38538) at
[jira] [Updated] (HIVE-27649) Subqueries with a set operator do not support order by statements
[ https://issues.apache.org/jira/browse/HIVE-27649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicolas Richard updated HIVE-27649: --- Description: Consider the following query: {code:java} select key from ((select key from src order by key) union (select key from src))subq {code} Up until 3.1.2, Hive would parse this query without any problems. However, if you try it on the latest versions, you'll get the following exception: {code:java} org.apache.hadoop.hive.ql.parse.ParseException: line 1:60 cannot recognize input near 'union' '(' 'select' in subquery source at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:125) at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:97) {code} With the inner exception stack trace being: {code:java} NoViableAltException(367@[]) at org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.identifier(HiveParser_IdentifiersParser.java:14006) at org.apache.hadoop.hive.ql.parse.HiveParser.identifier(HiveParser.java:45086) at org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.subQuerySource(HiveParser_FromClauseParser.java:5411) at org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.atomjoinSource(HiveParser_FromClauseParser.java:1921) at org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.joinSource(HiveParser_FromClauseParser.java:2175) at org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.atomjoinSource(HiveParser_FromClauseParser.java:2110) at org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.joinSource(HiveParser_FromClauseParser.java:2175) at org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.fromSource(HiveParser_FromClauseParser.java:1750) at org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.fromClause(HiveParser_FromClauseParser.java:1593) at org.apache.hadoop.hive.ql.parse.HiveParser.fromClause(HiveParser.java:45094) at 
org.apache.hadoop.hive.ql.parse.HiveParser.atomSelectStatement(HiveParser.java:38538) at org.apache.hadoop.hive.ql.parse.HiveParser.selectStatement(HiveParser.java:38831) at org.apache.hadoop.hive.ql.parse.HiveParser.regularBody(HiveParser.java:38424) at org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpressionBody(HiveParser.java:37686) at org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpression(HiveParser.java:37574) at org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:2757) at org.apache.hadoop.hive.ql.parse.HiveParser.explainStatement(HiveParser.java:1751) at org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1614) at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:123) at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:97) {code} Note that this behavior also happens for ORDER BY, SORT BY, CLUSTER BY, DISTRIBUTE BY or LIMIT was: Consider the following query: {code:java} select key from ((select key from src order by key) union (select key from src))subq {code} Up until 3.1.2, Hive would parse this query without any problems. 
However, if you try it on the latest versions, you'll get the following exception: {code:java} org.apache.hadoop.hive.ql.parse.ParseException: line 1:60 cannot recognize input near 'union' '(' 'select' in subquery source at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:125) at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:97) {code} With the inner exception stack trace being: {code:java} NoViableAltException(367@[]) at org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.identifier(HiveParser_IdentifiersParser.java:14006) at org.apache.hadoop.hive.ql.parse.HiveParser.identifier(HiveParser.java:45086) at org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.subQuerySource(HiveParser_FromClauseParser.java:5411) at org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.atomjoinSource(HiveParser_FromClauseParser.java:1921) at org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.joinSource(HiveParser_FromClauseParser.java:2175) at org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.atomjoinSource(HiveParser_FromClauseParser.java:2110) at org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.joinSource(HiveParser_FromClauseParser.java:2175) at org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.fromSource(HiveParser_FromClauseParser.java:1750) at org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.fromClause(HiveParser_FromClauseParser.java:1593) at org.apache.hadoop.hive.ql.parse.HiveParser.fromClause(HiveParser.java:45094) at org.apache.hadoop.hive.ql.parse.HiveParser.atomSelectStatement(HiveParser.java:38538) at
[jira] [Updated] (HIVE-27649) Subqueries with a set operator do not support order by clauses
[ https://issues.apache.org/jira/browse/HIVE-27649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicolas Richard updated HIVE-27649: --- Summary: Subqueries with a set operator do not support order by clauses (was: Subqueries with a set operator do not support order by statements) > Subqueries with a set operator do not support order by clauses > -- > > Key: HIVE-27649 > URL: https://issues.apache.org/jira/browse/HIVE-27649 > Project: Hive > Issue Type: Bug > Components: Parser >Affects Versions: 3.1.2, 4.0.0 >Reporter: Nicolas Richard >Priority: Major > > Consider the following query: > {code:java} > select key from ((select key from src order by key) union (select key from > src))subq {code} > Up until 3.1.2, Hive would parse this query without any problems. However, if > you try it on the latest versions, you'll get the following exception: > {code:java} > org.apache.hadoop.hive.ql.parse.ParseException: line 1:60 cannot recognize > input near 'union' '(' 'select' in subquery source > at > org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:125) > at > org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:97) {code} > With the inner exception stack trace being: > {code:java} > NoViableAltException(367@[]) > at > org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.identifier(HiveParser_IdentifiersParser.java:14006) > at > org.apache.hadoop.hive.ql.parse.HiveParser.identifier(HiveParser.java:45086) > at > org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.subQuerySource(HiveParser_FromClauseParser.java:5411) > at > org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.atomjoinSource(HiveParser_FromClauseParser.java:1921) > at > org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.joinSource(HiveParser_FromClauseParser.java:2175) > at > org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.atomjoinSource(HiveParser_FromClauseParser.java:2110) > at > 
org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.joinSource(HiveParser_FromClauseParser.java:2175) > at > org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.fromSource(HiveParser_FromClauseParser.java:1750) > at > org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.fromClause(HiveParser_FromClauseParser.java:1593) > at > org.apache.hadoop.hive.ql.parse.HiveParser.fromClause(HiveParser.java:45094) > at > org.apache.hadoop.hive.ql.parse.HiveParser.atomSelectStatement(HiveParser.java:38538) > at > org.apache.hadoop.hive.ql.parse.HiveParser.selectStatement(HiveParser.java:38831) > at > org.apache.hadoop.hive.ql.parse.HiveParser.regularBody(HiveParser.java:38424) > at > org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpressionBody(HiveParser.java:37686) > at > org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpression(HiveParser.java:37574) > at > org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:2757) > at > org.apache.hadoop.hive.ql.parse.HiveParser.explainStatement(HiveParser.java:1751) > at > org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1614) > at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:123) > at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:97) > {code} > Note that this behavior also happens for ORDER BY, SORT BY, CLUSTER BY, > DISTRIBUTE BY or LIMIT > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HIVE-27649) Subqueries with a set operator do not support order by statements
Nicolas Richard created HIVE-27649: -- Summary: Subqueries with a set operator do not support order by statements Key: HIVE-27649 URL: https://issues.apache.org/jira/browse/HIVE-27649 Project: Hive Issue Type: Bug Components: Parser Affects Versions: 3.1.2, 4.0.0 Reporter: Nicolas Richard Consider the following query: {code:java} select key from ((select key from src order by key) union (select key from src))subq {code} Up until 3.1.2, Hive would parse this query without any problems. However, if you try it on the latest versions, you'll get the following exception: {code:java} org.apache.hadoop.hive.ql.parse.ParseException: line 1:60 cannot recognize input near 'union' '(' 'select' in subquery source at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:125) at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:97) {code} With the inner exception stack trace being: {code:java} NoViableAltException(367@[]) at org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.identifier(HiveParser_IdentifiersParser.java:14006) at org.apache.hadoop.hive.ql.parse.HiveParser.identifier(HiveParser.java:45086) at org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.subQuerySource(HiveParser_FromClauseParser.java:5411) at org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.atomjoinSource(HiveParser_FromClauseParser.java:1921) at org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.joinSource(HiveParser_FromClauseParser.java:2175) at org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.atomjoinSource(HiveParser_FromClauseParser.java:2110) at org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.joinSource(HiveParser_FromClauseParser.java:2175) at org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.fromSource(HiveParser_FromClauseParser.java:1750) at org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.fromClause(HiveParser_FromClauseParser.java:1593) at 
org.apache.hadoop.hive.ql.parse.HiveParser.fromClause(HiveParser.java:45094) at org.apache.hadoop.hive.ql.parse.HiveParser.atomSelectStatement(HiveParser.java:38538) at org.apache.hadoop.hive.ql.parse.HiveParser.selectStatement(HiveParser.java:38831) at org.apache.hadoop.hive.ql.parse.HiveParser.regularBody(HiveParser.java:38424) at org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpressionBody(HiveParser.java:37686) at org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpression(HiveParser.java:37574) at org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:2757) at org.apache.hadoop.hive.ql.parse.HiveParser.explainStatement(HiveParser.java:1751) at org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1614) at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:123) at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:97) {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HIVE-27648) CREATE TABLE with CHECK constraint fails with SemanticException
Soumyakanti Das created HIVE-27648: -- Summary: CREATE TABLE with CHECK constraint fails with SemanticException Key: HIVE-27648 URL: https://issues.apache.org/jira/browse/HIVE-27648 Project: Hive Issue Type: Bug Components: Hive Reporter: Soumyakanti Das When we run: {code:java} create table test ( col1 int, `col 2` int check (`col 2` > 10) enable novalidate rely, constraint check_constraint check (col1 + `col 2` > 15) enable novalidate rely ); {code} It fails with: {code:java} org.apache.hadoop.hive.ql.parse.SemanticException: Invalid Constraint syntax Invalid CHECK constraint expression: col 2 > 10. at org.apache.hadoop.hive.ql.ddl.table.constraint.ConstraintsUtils.validateCheckConstraint(ConstraintsUtils.java:462) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeCreateTable(SemanticAnalyzer.java:13839) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genResolvedParseTree(SemanticAnalyzer.java:12618) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12787) at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:467) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:327) at org.apache.hadoop.hive.ql.Compiler.analyze(Compiler.java:224) at org.apache.hadoop.hive.ql.Compiler.compile(Compiler.java:107) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:519) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:471) at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:436) at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:430) at org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:121) at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:227) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:257) at org.apache.hadoop.hive.cli.CliDriver.processCmd1(CliDriver.java:201) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:127) 
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:425) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:356) at org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:733) at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:703) at org.apache.hadoop.hive.cli.control.CoreCliDriver.runTest(CoreCliDriver.java:115) at org.apache.hadoop.hive.cli.control.CliAdapter.runTest(CliAdapter.java:157) at org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver(TestMiniLlapLocalCliDriver.java:62) {code} I noticed while debugging that the check constraint expression in [cc.getCheck_expression()|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/ddl/table/constraint/ConstraintsUtils.java#L446] doesn't include the backticks (`), and this results in wrong token generation. -- This message was sent by Atlassian Jira (v8.20.10#820010)
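The effect of losing the backticks can be sketched with a toy tokenizer (hypothetical Python, not the actual ConstraintsUtils or parser code): a backtick-quoted identifier containing a space is one token, but once the backticks are stripped it splits into two, so the expression no longer has the shape column-operator-literal.

```python
import re

def tokenize(expr):
    # A backtick-quoted identifier is a single token; otherwise split into
    # words and operator runs. Illustrative only.
    return re.findall(r"`[^`]*`|\w+|[><=+*-]+", expr)

stored = tokenize("`col 2` > 10")   # as declared in the DDL: 3 tokens
rebuilt = tokenize("col 2 > 10")    # backticks dropped before re-parse: 4 tokens
```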
[jira] [Work started] (HIVE-27640) Counter for query concurrency
[ https://issues.apache.org/jira/browse/HIVE-27640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-27640 started by László Bodor. --- > Counter for query concurrency > - > > Key: HIVE-27640 > URL: https://issues.apache.org/jira/browse/HIVE-27640 > Project: Hive > Issue Type: Sub-task >Reporter: László Bodor >Assignee: László Bodor >Priority: Major > > This is kind of hard to catch easily, but I would like to see > something/anything about query concurrency in the query counters. This way we > can instantly see in the query summary what happened. I mean counters like: > 1. how many queries were running when this query arrived > 2. same as 1) but in query stage level > 2a) how many queries were being compiled (or waiting for compilation) when > this query started to compile (or started to enqueued for compilation) > 2b) how many queries were waiting for a coordinator when this query started > to get a coordinator > 2c) how many queries were in the Run DAG phase, when this query started to > run DAG -- This message was sent by Atlassian Jira (v8.20.10#820010)
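A minimal sketch of what such counters could look like (illustrative Python only; the class, API, and stage names are invented here, not Hive's): each query, on entering a stage, records how many other queries already occupy it, and that snapshot becomes the published counter.

```python
import threading

class ConcurrencyCounters:
    """Per-stage gauge of in-flight queries. Hypothetical sketch."""

    def __init__(self):
        self._lock = threading.Lock()
        self._active = {}  # stage name -> number of queries currently in it

    def enter(self, stage):
        # Returns how many queries were already in `stage`; the caller can
        # publish this as a query counter (e.g. "queries compiling on arrival").
        with self._lock:
            already = self._active.get(stage, 0)
            self._active[stage] = already + 1
            return already

    def leave(self, stage):
        with self._lock:
            self._active[stage] -= 1

counters = ConcurrencyCounters()
a = counters.enter("COMPILING")  # first query finds 0 others compiling
b = counters.enter("COMPILING")  # second query finds 1 other compiling
counters.leave("COMPILING")
counters.leave("COMPILING")
```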
[jira] [Updated] (HIVE-27647) NullPointerException from LowLevelCacheImpl
[ https://issues.apache.org/jira/browse/HIVE-27647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kwangwon (Trey) Yi updated HIVE-27647: -- Description: Hi all, I've executed Hive using the Tez engine and got the NPE below. It looks like a NullPointerException was thrown from the LLAP LowLevelCacheImpl class when `putFileData` was called. What are some possible reasons for this error, and how could it be mitigated? I wasn't able to find the same issue in past Jira tickets, so any advice would be much appreciated. Thank you. {quote}2023-08-24 10:22:31,122 [INFO] [Dispatcher thread {Central}] |HistoryEventHandler.criticalEvents|: [HISTORY][DAG:dag_1691565142260_0112_1][Event:TASK_ATTEMPT_FINISHED]: vertexName=Map 4, taskAttemptId=attempt_1691565142260_0112_1_00_10_2, creationTime=1691962230725, allocationTime=1691962230727, startTime=1691962230728, finishTime=1691962230991, timeTaken=263, status=FAILED, taskFailureType=NON_FATAL, errorEnum=FRAMEWORK_ERROR, diagnostics=Error: Error while running task ( failure ) : attempt_1691565142260_0112_1_00_10_2:java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: java.lang.NullPointerException at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:303) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:254) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374) at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73) at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730) at 
org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61) at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37) at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) at org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:118) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:750) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: java.lang.NullPointerException at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:80) at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:426) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:272) ... 
15 more Caused by: java.io.IOException: java.lang.NullPointerException at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121) at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77) at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:376) at org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:82) at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:119) at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:59) at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:151) at org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:116) at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68) ... 17 more Caused by: java.lang.NullPointerException at org.apache.hadoop.hive.llap.cache.LowLevelCacheImpl.putFileData(LowLevelCacheImpl.java:300) at org.apache.hadoop.hive.llap.io.api.impl.LlapIoImpl$GenericDataCache.putFileData(LlapIoImpl.java:303) at org.apache.hadoop.hive.llap.LlapCacheAwareFs$CacheAwareInputStream.read(LlapCacheAwareFs.java:324) at org.apache.commons.io.IOUtils.read(IOUtils.java:1542) at org.apache.commons.io.IOUtils.readFully(IOUtils.java:1658) at org.apache.hadoop.util.ByteBufferIOUtils.readFullyHeapBuffer(ByteBufferIOUtils.java:89) at org.apache.hadoop.util.ByteBufferIOUtils.readFully(ByteBufferIOUtils.java:53) at
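As a purely generic illustration of how an NPE of this shape often arises (hypothetical Python toy, not Hive's LowLevelCacheImpl), a cache-put that iterates a batch of buffers will fail on a null entry, e.g. an allocation that failed upstream, unless it guards for it:

```python
class ToyCache:
    """Hypothetical file-data cache used only to illustrate the failure mode."""

    def __init__(self):
        self.store = {}  # (file_key, offset) -> bytes

    def put_file_data(self, file_key, chunks):
        # chunks: iterable of (offset, buffer) pairs
        for offset, buf in chunks:
            # Without this guard, dereferencing a None buffer would raise,
            # analogous to the NullPointerException in the stack trace above.
            if buf is None:
                raise ValueError(f"null buffer for {file_key} at offset {offset}")
            self.store[(file_key, offset)] = bytes(buf)
```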
[jira] [Updated] (HIVE-27647) NullPointerException from LowLevelCacheImpl
[ https://issues.apache.org/jira/browse/HIVE-27647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kwangwon (Trey) Yi updated HIVE-27647: -- Description: Hi all, I've executed Hive using Tez engine, and got the below NPE. It seems like there was a NullPointerException error from LLAP LowLevelCacheImpl class when `putFileData` was called. What are some possible reasons for this error and suggestions to mitigate it? I wasn't able to find the exact same issue from the past Jira tickets. It would be much appreciated if you could provide me some sort of advice. Thank you. {quote}2023-08-24 10:22:31,122 [INFO] [Dispatcher thread Unknown macro: Unknown macro: Unknown macro: Unknown macro: \{Central}] |HistoryEventHandler.criticalEvents|: [HISTORY][DAG:dag_1691565142260_0112_1][Event:TASK_ATTEMPT_FINISHED]: vertexName=Map 4, taskAttemptId=attempt_1691565142260_0112_1_00_10_2, creationTime=1691962230725, allocationTime=1691962230727, startTime=1691962230728, finishTime=1691962230991, timeTaken=263, status=FAILED, taskFailureType=NON_FATAL, errorEnum=FRAMEWORK_ERROR, diagnostics=Error: Error while running task ( failure ) : attempt_1691565142260_0112_1_00_10_2:java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: java.lang.NullPointerException at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:303) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:254) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374) at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73) at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730) at 
org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61) at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37) at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) at org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:118) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:750) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: java.lang.NullPointerException at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:80) at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:426) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:272) ... 
15 more Caused by: java.io.IOException: java.lang.NullPointerException at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121) at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77) at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:376) at org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:82) at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:119) at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:59) at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:151) at org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:116) at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68) ... 17 more Caused by: java.lang.NullPointerException at org.apache.hadoop.hive.llap.cache.LowLevelCacheImpl.putFileData(LowLevelCacheImpl.java:300) at org.apache.hadoop.hive.llap.io.api.impl.LlapIoImpl$GenericDataCache.putFileData(LlapIoImpl.java:303) at org.apache.hadoop.hive.llap.LlapCacheAwareFs$CacheAwareInputStream.read(LlapCacheAwareFs.java:324) at org.apache.commons.io.IOUtils.read(IOUtils.java:1542) at org.apache.commons.io.IOUtils.readFully(IOUtils.java:1658) at org.apache.hadoop.util.ByteBufferIOUtils.readFullyHeapBuffer(ByteBufferIOUtils.java:89) at org.apache.hadoop.util.ByteBufferIOUtils.readFully(ByteBufferIOUtils.java:53) at
[jira] [Updated] (HIVE-27647) NullPointerException from LowLevelCacheImpl
[ https://issues.apache.org/jira/browse/HIVE-27647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kwangwon (Trey) Yi updated HIVE-27647: -- Description: Hi all, I ran Hive on the Tez engine and hit the NullPointerException below. It appears to come from the LLAP LowLevelCacheImpl class when `putFileData` is called. What are some possible reasons for this error, and how could it be mitigated? I wasn't able to find the same issue in past Jira tickets, so any advice would be much appreciated. Thank you.
{quote}
2023-08-24 10:22:31,122 [INFO] [Dispatcher thread {Central}] |HistoryEventHandler.criticalEvents|: [HISTORY][DAG:dag_1691565142260_0112_1][Event:TASK_ATTEMPT_FINISHED]: vertexName=Map 4, taskAttemptId=attempt_1691565142260_0112_1_00_10_2, creationTime=1691962230725, allocationTime=1691962230727, startTime=1691962230728, finishTime=1691962230991, timeTaken=263, status=FAILED, taskFailureType=NON_FATAL, errorEnum=FRAMEWORK_ERROR, diagnostics=Error: Error while running task ( failure ) : attempt_1691565142260_0112_1_00_10_2:java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: java.lang.NullPointerException
	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:303)
	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:254)
	at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
	at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
	at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
	at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
	at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
	at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
	at org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:118)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:750)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: java.lang.NullPointerException
	at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:80)
	at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:426)
	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:272)
	... 15 more
Caused by: java.io.IOException: java.lang.NullPointerException
	at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
	at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
	at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:376)
	at org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:82)
	at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:119)
	at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:59)
	at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:151)
	at org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:116)
	at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68)
	... 17 more
Caused by: java.lang.NullPointerException
	at org.apache.hadoop.hive.llap.cache.LowLevelCacheImpl.putFileData(LowLevelCacheImpl.java:300)
	at org.apache.hadoop.hive.llap.io.api.impl.LlapIoImpl$GenericDataCache.putFileData(LlapIoImpl.java:303)
	at org.apache.hadoop.hive.llap.LlapCacheAwareFs$CacheAwareInputStream.read(LlapCacheAwareFs.java:324)
	at org.apache.commons.io.IOUtils.read(IOUtils.java:1542)
	at org.apache.commons.io.IOUtils.readFully(IOUtils.java:1658)
	at org.apache.hadoop.util.ByteBufferIOUtils.readFullyHeapBuffer(ByteBufferIOUtils.java:89)
	at org.apache.hadoop.util.ByteBufferIOUtils.readFully(ByteBufferIOUtils.java:53)
	at org.apache.hadoop.fs.DefaultMultiByteBufferReader.readFullyIntoBuffers(DefaultMultiByteBufferReader.java:36)
	at org.apache.hadoop.fs.FSDataInputStream.readFullyIntoBuffers(FSDataInputStream.java:264)
	at
{quote}
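The trace above buries the real failure under two wrapper exceptions (RuntimeException wrapping HiveException/IOException wrapping the NPE). When triaging reports like this, it helps to walk the cause chain programmatically to the innermost exception. A minimal, self-contained Java sketch of that unwinding (generic illustration only, not Hive code; the rebuilt exception nesting simply mimics the log):

```java
import java.io.IOException;

public class RootCause {
    // Walk getCause() to the innermost throwable, mirroring how the
    // "Caused by:" entries in the trace nest. Guards against self-caused
    // throwables to avoid an infinite loop.
    static Throwable rootCause(Throwable t) {
        while (t.getCause() != null && t.getCause() != t) {
            t = t.getCause();
        }
        return t;
    }

    public static void main(String[] args) {
        // Rebuild the nesting seen in the log:
        // RuntimeException -> IOException -> NullPointerException
        Throwable wrapped =
            new RuntimeException(new IOException(new NullPointerException()));
        // Prints the root cause's simple class name
        System.out.println(rootCause(wrapped).getClass().getSimpleName());
    }
}
```

Here the root cause resolves to the NullPointerException thrown at LowLevelCacheImpl.putFileData, which is the frame worth inspecting in the Hive source.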
[jira] [Created] (HIVE-27647) NullPointerException from LowLevelCacheImpl
Kwangwon (Trey) Yi created HIVE-27647: - Summary: NullPointerException from LowLevelCacheImpl Key: HIVE-27647 URL: https://issues.apache.org/jira/browse/HIVE-27647 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 3.1.3 Environment: Hive: 3.1.3 Tez: 0.9.2 Reporter: Kwangwon (Trey) Yi Hi all, I've executed Hive using the Tez engine and got the NPE below. It seems there was a NullPointerException from the LLAP LowLevelCacheImpl class when `putFileData` was called. What are some possible reasons for this error, and what would you suggest to mitigate it? Any advice would be much appreciated. Thank you.
```
2023-08-24 10:22:31,122 [INFO] [Dispatcher thread {Central}] |HistoryEventHandler.criticalEvents|: [HISTORY][DAG:dag_1691565142260_0112_1][Event:TASK_ATTEMPT_FINISHED]: vertexName=Map 4, taskAttemptId=attempt_1691565142260_0112_1_00_10_2, creationTime=1691962230725, allocationTime=1691962230727, startTime=1691962230728, finishTime=1691962230991, timeTaken=263, status=FAILED, taskFailureType=NON_FATAL, errorEnum=FRAMEWORK_ERROR, diagnostics=Error: Error while running task ( failure ) : attempt_1691565142260_0112_1_00_10_2:java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: java.lang.NullPointerException
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:303)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:254)
    at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
    at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
    at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
    at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
    at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
    at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
    at org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:118)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:750)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: java.lang.NullPointerException
    at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:80)
    at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:426)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:272)
    ... 15 more
Caused by: java.io.IOException: java.lang.NullPointerException
    at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
    at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
    at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:376)
    at org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:82)
    at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:119)
    at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:59)
    at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:151)
    at org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:116)
    at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68)
    ... 17 more
Caused by: java.lang.NullPointerException
    at org.apache.hadoop.hive.llap.cache.LowLevelCacheImpl.putFileData(LowLevelCacheImpl.java:300)
    at org.apache.hadoop.hive.llap.io.api.impl.LlapIoImpl$GenericDataCache.putFileData(LlapIoImpl.java:303)
    at org.apache.hadoop.hive.llap.LlapCacheAwareFs$CacheAwareInputStream.read(LlapCacheAwareFs.java:324)
    at org.apache.commons.io.IOUtils.read(IOUtils.java:1542)
    at org.apache.commons.io.IOUtils.readFully(IOUtils.java:1658)
    at org.apache.hadoop.util.ByteBufferIOUtils.readFullyHeapBuffer(ByteBufferIOUtils.java:89)
    at org.apache.hadoop.util.ByteBufferIOUtils.readFully(ByteBufferIOUtils.java:53)
    at
```
[jira] [Resolved] (HIVE-27595) Improve efficiency in the filtering hooks
[ https://issues.apache.org/jira/browse/HIVE-27595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naveen Gangam resolved HIVE-27595. -- Fix Version/s: 4.0.0 Resolution: Fixed Fix has been merged to master. Thank you for the patch [~henrib] and reviews [~hemanth619] and [~jfs] > Improve efficiency in the filtering hooks > - > > Key: HIVE-27595 > URL: https://issues.apache.org/jira/browse/HIVE-27595 > Project: Hive > Issue Type: Improvement > Components: HiveServer2 >Affects Versions: 4.0.0-alpha-2 >Reporter: Naveen Gangam >Assignee: Henri Biestro >Priority: Minor > Labels: pull-request-available > Fix For: 4.0.0 > > > https://github.com/apache/hive/blob/a406d6d4417277e45b93f1733bed5201afdee29b/ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/metastore/HiveMetaStoreAuthorizer.java#L353-L377 > In cases where the tableList has a large number of tables (tested with 200k in > my case), the hivePrivilegedObjects list could be just as big. So both these lists > are 200k. > Essentially, the code is trying to return a subset of the tableList collection > that matches the objects returned in hivePrivilegedObjects. This results in an > N*N iteration that causes bad performance (in my case, the HMS client > timeout expired and SHOW TABLES failed). > This code needs to be optimized for performance. > We have a similar problem in this code as well: > ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/AuthorizationMetaStoreFilterHook.java -- This message was sent by Atlassian Jira (v8.20.10#820010)
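The N*N pattern described in HIVE-27595 and the usual fix can be sketched as follows. This is an illustrative standalone example, not the actual HiveMetaStoreAuthorizer code: the names (FilterSketch, filterQuadratic, filterLinear, tableList, privileged) are hypothetical, and the real code compares HivePrivilegeObject instances rather than plain strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical sketch of replacing an O(N*N) "retain matching elements" loop
// with a HashSet-based O(N) filter, as suggested by the ticket.
class FilterSketch {
    // Naive version: for each table, scan the whole privileged list -> N*N comparisons.
    static List<String> filterQuadratic(List<String> tableList, List<String> privileged) {
        List<String> out = new ArrayList<>();
        for (String t : tableList) {
            if (privileged.contains(t)) { // linear scan per element
                out.add(t);
            }
        }
        return out;
    }

    // Optimized version: build a HashSet once, then filter with O(1) lookups -> O(N) overall.
    static List<String> filterLinear(List<String> tableList, List<String> privileged) {
        Set<String> allowed = new HashSet<>(privileged);
        return tableList.stream().filter(allowed::contains).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> tables = Arrays.asList("t1", "t2", "t3");
        List<String> priv = Arrays.asList("t2", "t3");
        // Both versions return the same subset; only the complexity differs.
        System.out.println(filterLinear(tables, priv));
    }
}
```

Building the set once turns each per-element `contains` scan into a constant-time lookup, which is the kind of change that matters at 200k entries.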
[jira] [Updated] (HIVE-27646) Iceberg: Retry query when concurrent write queries fail due to conflicting writes
[ https://issues.apache.org/jira/browse/HIVE-27646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simhadri Govindappa updated HIVE-27646: --- Summary: Iceberg: Retry query when concurrent write queries fail due to conflicting writes (was: Iceberg: Retry query when concurrent write queries fail due to conflicting write) > Iceberg: Retry query when concurrent write queries fail due to conflicting > writes > - > > Key: HIVE-27646 > URL: https://issues.apache.org/jira/browse/HIVE-27646 > Project: Hive > Issue Type: Improvement >Reporter: Simhadri Govindappa >Assignee: Simhadri Govindappa >Priority: Major > > Assume two concurrent update queries- Query A and Query B , that have > overlapping updates. > If Query A commits the data and delete files first, then Query B will fail > with validation failure due to conflicting writes. > In this case, Query B should invalidate the commit files that are already > generated and re-execute the full query on the latest snapshot. -- This message was sent by Atlassian Jira (v8.20.10#820010)
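The retry behavior proposed in HIVE-27646 can be sketched as a generic loop: attempt the query, and if the commit fails validation because of a conflicting concurrent write, discard the generated files and re-execute against the latest snapshot. This is a hedged illustration, not Hive's implementation; ConflictException is a stand-in for Iceberg's commit-validation failure (org.apache.iceberg.exceptions.ValidationException), and the invalidate/refresh steps are only indicated by comments.

```java
import java.util.function.Supplier;

// Illustrative retry-on-conflict loop; all names here are hypothetical.
class RetrySketch {
    static class ConflictException extends RuntimeException {}

    static <T> T runWithRetry(Supplier<T> query, int maxAttempts) {
        for (int attempt = 1; ; attempt++) {
            try {
                return query.get();                  // plan + write + commit
            } catch (ConflictException e) {
                if (attempt >= maxAttempts) throw e; // give up after maxAttempts
                // Here the real implementation would invalidate the already-generated
                // commit files and refresh to the latest table snapshot before looping
                // to re-execute the full query.
            }
        }
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulate Query B: its first commit conflicts with Query A, the retry succeeds.
        String result = runWithRetry(() -> {
            if (calls[0]++ == 0) throw new ConflictException();
            return "committed";
        }, 3);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```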
[jira] [Updated] (HIVE-27646) Iceberg: Retry query when concurrent write queries fail due to conflicting write
[ https://issues.apache.org/jira/browse/HIVE-27646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simhadri Govindappa updated HIVE-27646: --- Description: Assume two concurrent update queries- Query A and Query B , that have overlapping updates. If Query A commits the data and delete files first, then Query B will fail with validation failure due to conflicting writes. In this case, Query B should invalidate the commit files that are already generated and re-execute the full query on the latest snapshot. was: During concurrent updates, Assume 2 concurrent update queries- Query A and Query B that have intersecting updates If Query A commits the data and delet If any conflicting files are detected during the commit stage of the query that commits last, we will have to re-execute the full query. > Iceberg: Retry query when concurrent write queries fail due to conflicting > write > > > Key: HIVE-27646 > URL: https://issues.apache.org/jira/browse/HIVE-27646 > Project: Hive > Issue Type: Improvement >Reporter: Simhadri Govindappa >Assignee: Simhadri Govindappa >Priority: Major > > Assume two concurrent update queries- Query A and Query B , that have > overlapping updates. > If Query A commits the data and delete files first, then Query B will fail > with validation failure due to conflicting writes. > In this case, Query B should invalidate the commit files that are already > generated and re-execute the full query on the latest snapshot. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HIVE-27646) Iceberg: Retry query when concurrent write queries fail due to conflicting write
[ https://issues.apache.org/jira/browse/HIVE-27646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simhadri Govindappa updated HIVE-27646: --- Description: During concurrent updates, Assume 2 concurrent update queries- Query A and Query B that have intersecting updates If Query A commits the data and delet If any conflicting files are detected during the commit stage of the query that commits last, we will have to re-execute the full query. was: During concurrent updates, Assume 2 concurrent update queries- Query A If any conflicting files are detected during the commit stage of the query that commits last, we will have to re-execute the full query. > Iceberg: Retry query when concurrent write queries fail due to conflicting > write > > > Key: HIVE-27646 > URL: https://issues.apache.org/jira/browse/HIVE-27646 > Project: Hive > Issue Type: Improvement >Reporter: Simhadri Govindappa >Assignee: Simhadri Govindappa >Priority: Major > > During concurrent updates, > Assume 2 concurrent update queries- Query A and Query B that have > intersecting updates > If Query A commits the data and delet > If any conflicting files are detected during the commit stage of the query > that commits last, we will have to re-execute the full query. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HIVE-27646) Iceberg: Retry query when concurrent write queries fail due to conflicting write
[ https://issues.apache.org/jira/browse/HIVE-27646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simhadri Govindappa updated HIVE-27646: --- Description: During concurrent updates, Assume 2 concurrent update queries- Query A If any conflicting files are detected during the commit stage of the query that commits last, we will have to re-execute the full query. was: During concurrent updates, If any conflicting files are detected during the commit stage of the query that commits last, we will have to re-execute the full query. > Iceberg: Retry query when concurrent write queries fail due to conflicting > write > > > Key: HIVE-27646 > URL: https://issues.apache.org/jira/browse/HIVE-27646 > Project: Hive > Issue Type: Improvement >Reporter: Simhadri Govindappa >Assignee: Simhadri Govindappa >Priority: Major > > During concurrent updates, > Assume 2 concurrent update queries- Query A > If any conflicting files are detected during the commit stage of the query > that commits last, we will have to re-execute the full query. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HIVE-27646) Iceberg: Retry query when concurrent write queries fail due to conflicting write
[ https://issues.apache.org/jira/browse/HIVE-27646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simhadri Govindappa updated HIVE-27646: --- Description: During concurrent updates, If any conflicting files are detected during the commit stage of the query that commits last, we will have to re-execute the full query. > Iceberg: Retry query when concurrent write queries fail due to conflicting > write > > > Key: HIVE-27646 > URL: https://issues.apache.org/jira/browse/HIVE-27646 > Project: Hive > Issue Type: Improvement >Reporter: Simhadri Govindappa >Assignee: Simhadri Govindappa >Priority: Major > > During concurrent updates, > If any conflicting files are detected during the commit stage of the query > that commits last, we will have to re-execute the full query. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HIVE-27646) Iceberg: Retry query when concurrent write queries fail due to conflicting write
[ https://issues.apache.org/jira/browse/HIVE-27646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simhadri Govindappa updated HIVE-27646: --- Summary: Iceberg: Retry query when concurrent write queries fail due to conflicting write (was: Iceberg: Re-execute query when concurrent writes fail due to conflicting write) > Iceberg: Retry query when concurrent write queries fail due to conflicting > write > > > Key: HIVE-27646 > URL: https://issues.apache.org/jira/browse/HIVE-27646 > Project: Hive > Issue Type: Improvement >Reporter: Simhadri Govindappa >Assignee: Simhadri Govindappa >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HIVE-27646) Iceberg: Re-execute query when concurrent writes fail due to conflicting write
Simhadri Govindappa created HIVE-27646: -- Summary: Iceberg: Re-execute query when concurrent writes fail due to conflicting write Key: HIVE-27646 URL: https://issues.apache.org/jira/browse/HIVE-27646 Project: Hive Issue Type: Improvement Reporter: Simhadri Govindappa Assignee: Simhadri Govindappa -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (HIVE-27640) Counter for query concurrency
[ https://issues.apache.org/jira/browse/HIVE-27640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] László Bodor reassigned HIVE-27640: --- Assignee: László Bodor > Counter for query concurrency > - > > Key: HIVE-27640 > URL: https://issues.apache.org/jira/browse/HIVE-27640 > Project: Hive > Issue Type: Sub-task >Reporter: László Bodor >Assignee: László Bodor >Priority: Major > > This is kind of hard to catch easily, but I would like to see > something/anything about query concurrency in the query counters. This way we > can instantly see in the query summary what happened. I mean counters like: > 1. how many queries were running when this query arrived > 2. same as 1) but in query stage level > 2a) how many queries were being compiled (or waiting for compilation) when > this query started to compile (or started to enqueued for compilation) > 2b) how many queries were waiting for a coordinator when this query started > to get a coordinator > 2c) how many queries were in the Run DAG phase, when this query started to > run DAG -- This message was sent by Atlassian Jira (v8.20.10#820010)
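Counter 1 above ("how many queries were running when this query arrived") amounts to snapshotting a per-stage in-flight gauge at entry; the stage-level variants (2a-2c) are the same gauge kept per stage. A minimal sketch of the idea, assuming a simple AtomicInteger-based gauge (illustrative names, not Hive code):

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical per-stage gauge: the value returned by enter() is exactly
// "how many queries were already in this stage when this query arrived",
// and could be recorded into the query's counters.
class ConcurrencyGauge {
    private final AtomicInteger inFlight = new AtomicInteger();

    /** Called when a query enters this stage; returns the prior in-flight count. */
    int enter() {
        return inFlight.getAndIncrement();
    }

    /** Called when the query leaves this stage. */
    void exit() {
        inFlight.decrementAndGet();
    }

    public static void main(String[] args) {
        ConcurrencyGauge compileStage = new ConcurrencyGauge();
        int seenByA = compileStage.enter(); // no query was compiling when A arrived
        int seenByB = compileStage.enter(); // one query (A) was compiling when B arrived
        compileStage.exit();
        compileStage.exit();
        System.out.println(seenByA + " " + seenByB);
    }
}
```

getAndIncrement keeps the read and the increment atomic, so concurrent arrivals each see a consistent count without external locking.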
[jira] [Resolved] (HIVE-27606) Backport of HIVE-21171: Skip creating scratch dirs for tez if RPC is on
[ https://issues.apache.org/jira/browse/HIVE-27606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sankar Hariappan resolved HIVE-27606. - Fix Version/s: 3.2.0 Resolution: Fixed > Backport of HIVE-21171: Skip creating scratch dirs for tez if RPC is on > --- > > Key: HIVE-27606 > URL: https://issues.apache.org/jira/browse/HIVE-27606 > Project: Hive > Issue Type: Sub-task >Reporter: Aman Raj >Assignee: Aman Raj >Priority: Major > Labels: pull-request-available > Fix For: 3.2.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (HIVE-27607) Backport of HIVE-21182 Skip setting up hive scratch dir during planning
[ https://issues.apache.org/jira/browse/HIVE-27607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sankar Hariappan resolved HIVE-27607. - Fix Version/s: 3.2.0 Resolution: Fixed > Backport of HIVE-21182 Skip setting up hive scratch dir during planning > --- > > Key: HIVE-27607 > URL: https://issues.apache.org/jira/browse/HIVE-27607 > Project: Hive > Issue Type: Sub-task >Reporter: Aman Raj >Assignee: Aman Raj >Priority: Major > Labels: pull-request-available > Fix For: 3.2.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HIVE-27630) Iceberg: Fast forward branch
[ https://issues.apache.org/jira/browse/HIVE-27630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-27630: -- Labels: pull-request-available (was: ) > Iceberg: Fast forward branch > > > Key: HIVE-27630 > URL: https://issues.apache.org/jira/browse/HIVE-27630 > Project: Hive > Issue Type: Sub-task >Reporter: Denys Kuzmenko >Assignee: Ayush Saxena >Priority: Major > Labels: pull-request-available > > Add support to fastForward main branch to the head of feature-branch to > update the main table state. > {code} > table.manageSnapshots().fastForward("main", "feature-branch").commit() > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HIVE-27643) Exclude compaction queries from ranger policies
[ https://issues.apache.org/jira/browse/HIVE-27643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-27643: -- Labels: pull-request-available (was: ) > Exclude compaction queries from ranger policies > --- > > Key: HIVE-27643 > URL: https://issues.apache.org/jira/browse/HIVE-27643 > Project: Hive > Issue Type: Bug >Reporter: László Végh >Assignee: László Végh >Priority: Critical > Labels: pull-request-available > > Applying masking or filtering Ranger policies on the compaction users cause > data loss, as the policies will be applied to the compaction queries also. > While this is a kind of misconfiguration, the result is so bad, that the > users should be protected from it by automatically excluding compaction > queries from ALL ranger policies. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (HIVE-27638) Preparing for 4.0.0-beta-2 development
[ https://issues.apache.org/jira/browse/HIVE-27638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stamatis Zampetakis resolved HIVE-27638. Fix Version/s: 4.0.0 Resolution: Fixed Fixed in https://github.com/apache/hive/commit/1ebef40aba00c4ec5376d8f0623196b42425a589. Thanks for the reviews [~ayushsaxena], [~aturoczy]. > Preparing for 4.0.0-beta-2 development > -- > > Key: HIVE-27638 > URL: https://issues.apache.org/jira/browse/HIVE-27638 > Project: Hive > Issue Type: Task >Reporter: Stamatis Zampetakis >Assignee: Stamatis Zampetakis >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > The main goal of this ticket is to increment the version and add the > necessary metastore upgrade scripts so we don't lose track of what changed > after the beta-1 release. > If later we decide to use another name (other than beta-2) that would be > completely fine (and hopefully a simple rename would do). The most important > thing in this change is to have the scripts in place so we don't mess up when > we push changes to the metastore schema. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HIVE-27639) Query level performance counters
[ https://issues.apache.org/jira/browse/HIVE-27639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] László Bodor updated HIVE-27639: Summary: Query level performance counters (was: Performance counters for easier investigations) > Query level performance counters > > > Key: HIVE-27639 > URL: https://issues.apache.org/jira/browse/HIVE-27639 > Project: Hive > Issue Type: Improvement >Reporter: László Bodor >Priority: Major > > We need to move performance measurements to the next level and implement > whatever is needed to make us able to find easier answers to problems like > “query is slow”. The problem is that we keep digging into logs + watching > metrics that are provided by *something* in the environment (that can be > anything that the actual vendor implements outside of hive). > Let's try to localize the environment problems to the interval of the slow > query and make it exposed through counters. > Also, let's keep in mind that performance measurements ideally should never > cause performance problems itself: heavyweight measurements should be > disabled by default. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HIVE-27630) Iceberg: Fast forward branch
[ https://issues.apache.org/jira/browse/HIVE-27630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena updated HIVE-27630: Summary: Iceberg: Fast forward branch (was: Iceberg: Fast forward/rebase branch) > Iceberg: Fast forward branch > > > Key: HIVE-27630 > URL: https://issues.apache.org/jira/browse/HIVE-27630 > Project: Hive > Issue Type: Sub-task >Reporter: Denys Kuzmenko >Assignee: Ayush Saxena >Priority: Major > > Add support to fastForward main branch to the head of feature-branch to > update the main table state. > {code} > table.manageSnapshots().fastForward("main", "feature-branch").commit() > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HIVE-27645) Clean test cases by refactoring assertFalse(equals()) using assertNotEquals & @Test(excepted) using assertThrows
[ https://issues.apache.org/jira/browse/HIVE-27645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-27645: -- Labels: pull-request-available (was: ) > Clean test cases by refactoring assertFalse(equals()) using assertNotEquals & > @Test(excepted) using assertThrows > > > Key: HIVE-27645 > URL: https://issues.apache.org/jira/browse/HIVE-27645 > Project: Hive > Issue Type: Improvement >Reporter: Taher Ghaleb >Priority: Minor > Labels: pull-request-available > > I am working on research that investigates test smell refactoring in which we > identify alternative implementations of test cases, study how commonly used > these refactorings are, and assess how acceptable they are in practice. > The first smell is when inappropriate assertions are used, while there exist > better alternatives. For example, {{{_}assertNotEquals(x, y){_};}} is more > appropriate to use instead of {_}{{assertFalse(x.equals( y );}}{_}. > The second smell is when exception handling can alternatively be implemented > using assertion rather than annotation. For example, > _{{{}assertThrows({}}}{{{}Exception{}}}{{{}.class, () -> \{...});{}}}_ is > more appropriate to use instead of {{{}_@Test(expected = > Exception.class)_{}}}. > While there could be several cases like this, we aim in this pull request to > get your feedback on these particular test smells and their refactorings. > Thanks in advance for your input. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HIVE-27645) Clean test cases by refactoring assertFalse(equals()) using assertNotEquals & @Test(excepted) using assertThrows
Taher Ghaleb created HIVE-27645: --- Summary: Clean test cases by refactoring assertFalse(equals()) using assertNotEquals & @Test(excepted) using assertThrows Key: HIVE-27645 URL: https://issues.apache.org/jira/browse/HIVE-27645 Project: Hive Issue Type: Improvement Reporter: Taher Ghaleb I am working on research that investigates test smell refactoring in which we identify alternative implementations of test cases, study how commonly used these refactorings are, and assess how acceptable they are in practice. The first smell is when inappropriate assertions are used, while there exist better alternatives. For example, {{{_}assertNotEquals(x, y){_};}} is more appropriate to use instead of {_}{{assertFalse(x.equals(y));}}{_}. The second smell is when exception handling can alternatively be implemented using assertion rather than annotation. For example, _{{{}assertThrows({}}}{{{}Exception{}}}{{{}.class, () -> \{...});{}}}_ is more appropriate to use instead of {{{}_@Test(expected = Exception.class)_{}}}. While there could be several cases like this, we aim in this pull request to get your feedback on these particular test smells and their refactorings. Thanks in advance for your input. -- This message was sent by Atlassian Jira (v8.20.10#820010)
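The two refactorings proposed in HIVE-27645 can be shown side by side. To keep this example self-contained without JUnit on the classpath, minimal stand-ins for assertNotEquals and assertThrows are defined below; in real tests these come from org.junit.Assert or org.junit.jupiter.api.Assertions.

```java
// Demonstration of the two test-smell refactorings described in the ticket.
class TestSmellDemo {
    // Stand-in for JUnit's assertNotEquals(unexpected, actual).
    static void assertNotEquals(Object unexpected, Object actual) {
        boolean equal = (unexpected == null) ? actual == null : unexpected.equals(actual);
        if (equal) throw new AssertionError("values should differ: " + actual);
    }

    // Stand-in for JUnit 4.13+/5's assertThrows(type, executable).
    static <T extends Throwable> T assertThrows(Class<T> type, Runnable executable) {
        try {
            executable.run();
        } catch (Throwable t) {
            if (type.isInstance(t)) return type.cast(t);
            throw new AssertionError("unexpected exception type: " + t);
        }
        throw new AssertionError("expected " + type.getSimpleName() + " but nothing was thrown");
    }

    public static void main(String[] args) {
        // Smell 1: assertFalse(x.equals(y)) hides the intent ->
        // assertNotEquals(x, y) states it directly and gives a better failure message.
        assertNotEquals("x", "y");

        // Smell 2: @Test(expected = ArithmeticException.class) accepts the exception
        // from anywhere in the method -> assertThrows pinpoints the throwing statement.
        assertThrows(ArithmeticException.class, () -> { int z = 1 / 0; });

        System.out.println("ok");
    }
}
```

A further advantage of assertThrows over the annotation is that the returned exception can be inspected (message, cause) in follow-up assertions.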
[jira] [Updated] (HIVE-27645) Clean test cases by refactoring assertFalse(equals()) using assertNotEquals & @Test(excepted) using assertThrows
[ https://issues.apache.org/jira/browse/HIVE-27645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Taher Ghaleb updated HIVE-27645: Description: I am working on research that investigates test smell refactoring in which we identify alternative implementations of test cases, study how commonly used these refactorings are, and assess how acceptable they are in practice. The first smell is when inappropriate assertions are used, while there exist better alternatives. For example, {{{_}assertNotEquals(x, y){_};}} is more appropriate to use instead of {_}{{assertFalse(x.equals( y );}}{_}. The second smell is when exception handling can alternatively be implemented using assertion rather than annotation. For example, _{{{}assertThrows({}}}{{{}Exception{}}}{{{}.class, () -> \{...});{}}}_ is more appropriate to use instead of {{{}_@Test(expected = Exception.class)_{}}}. While there could be several cases like this, we aim in this pull request to get your feedback on these particular test smells and their refactorings. Thanks in advance for your input. was: I am working on research that investigates test smell refactoring in which we identify alternative implementations of test cases, study how commonly used these refactorings are, and assess how acceptable they are in practice. The first smell is when inappropriate assertions are used, while there exist better alternatives. For example, {{{_}assertNotEquals(x, y){_};}} is more appropriate to use instead of {_}{{assertFalse(x.equals(y));}}{_}. The second smell is when exception handling can alternatively be implemented using assertion rather than annotation. For example, _{{{}assertThrows({}}}{{{}Exception{}}}{{{}.class, () -> \{...});{}}}_ is more appropriate to use instead of {{{}_@Test(expected = Exception.class)_{}}}. While there could be several cases like this, we aim in this pull request to get your feedback on these particular test smells and their refactorings. Thanks in advance for your input. 
> Clean test cases by refactoring assertFalse(equals()) using assertNotEquals & > @Test(excepted) using assertThrows > > > Key: HIVE-27645 > URL: https://issues.apache.org/jira/browse/HIVE-27645 > Project: Hive > Issue Type: Improvement >Reporter: Taher Ghaleb >Priority: Minor > > I am working on research that investigates test smell refactoring in which we > identify alternative implementations of test cases, study how commonly used > these refactorings are, and assess how acceptable they are in practice. > The first smell is when inappropriate assertions are used, while there exist > better alternatives. For example, {{{_}assertNotEquals(x, y){_};}} is more > appropriate to use instead of {_}{{assertFalse(x.equals( y );}}{_}. > The second smell is when exception handling can alternatively be implemented > using assertion rather than annotation. For example, > _{{{}assertThrows({}}}{{{}Exception{}}}{{{}.class, () -> \{...});{}}}_ is > more appropriate to use instead of {{{}_@Test(expected = > Exception.class)_{}}}. > While there could be several cases like this, we aim in this pull request to > get your feedback on these particular test smells and their refactorings. > Thanks in advance for your input. -- This message was sent by Atlassian Jira (v8.20.10#820010)