[jira] [Created] (HIVE-28503) Wrong results(NULL) when string concat operation with || operator for ORC file format when vectorization enabled
Mahesh Raju Somalaraju created HIVE-28503: - Summary: Wrong results(NULL) when string concat operation with || operator for ORC file format when vectorization enabled Key: HIVE-28503 URL: https://issues.apache.org/jira/browse/HIVE-28503 Project: Hive Issue Type: Bug Components: Hive Reporter: Mahesh Raju Somalaraju Assignee: Mahesh Raju Somalaraju

Wrong results (NULL) are returned by a string concat with the || operator on the ORC file format when vectorization is enabled:

set hive.query.results.cache.enabled=false;
set hive.fetch.task.conversion=none;
set hive.vectorized.execution.enabled=true;

The result is NULL when we do a concat operation with the || operator. The issue is not reproducible locally; it reproduces on a cluster with more records. The input data should be a mix of NULL and NOT NULL values, something like below: create a table in the ORC file format with 3 string columns and insert data such that it has a mix of NULL and NOT NULL values.

|column1|column2|column3|count|
|NULL |NULL |NULL |18000 |
|G |L |A1 |123932 |

With the above configuration, perform a concat operation with the || operator and insert a new row with the concat results.

select * from (select t1.column1, t1.column2, t1.column3, *t1.column1 || t1.column2 || t1.column3 as VEH_MODEL_ID* from test_table t1 )t where VEH_MODEL_ID is NULL and if(column1 is null,0,1)=1 AND if(column2 is null,0,1)=1 AND if(column3 is null,0,1)=1 limit 1;

In the above query, the *t1.column1 || t1.column2 || t1.column3 as VEH_MODEL_ID* expression returns NULL even though the input string values are not null:

|t.VEH_MODEL_ID|t.column1|t.column2|t.column3|
|NULL|G|L|A2|

+Proposed solution as per code review:+

+*Root cause:*+ While doing the concat operation in the *StringGroupConcatColCol* class, if the input batch vectors hold a mix of NULL and NOT NULL values, the NULL-related flags on the output vector batch are not set correctly. Each value in the vector carries a flag saying whether it is NULL or NOT NULL, but the flag for the whole output vector (outV.noNulls) is not being set correctly. Parquet happens to work without this flag, presumably because its reader checks each per-value flag instead of the whole-vector flag.

+*code snippet:*+ *StringGroupConcatColCol->evaluate() method:*

if (inV1.noNulls && !inV2.noNulls) {
  // one input can contain NULLs, so the output can contain NULLs
  outV.noNulls = false;
} else if (!inV1.noNulls && inV2.noNulls) {
  // one input can contain NULLs, so the output can contain NULLs
  outV.noNulls = false;
} else if (!inV1.noNulls && !inV2.noNulls) {
  // both inputs can contain NULLs, so the output can contain NULLs
  outV.noNulls = false;
} else {
  // there are no NULLs in either input vector
  {color:#4c9aff}outV.noNulls = true; // this has to be set to true, as there are no NULL values; this assignment is currently missing{color}
  // perform data operation
}

-- This message was sent by Atlassian Jira (v8.20.10#820010)
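The flag handling described above can be sketched in isolation. The following is a minimal, hypothetical model in plain Java, not the actual Hive BytesColumnVector or StringGroupConcatColCol classes; the `Vec` class and field names are illustrative stand-ins:

```java
// Minimal model of vectorized-concat null bookkeeping. Each "vector" carries
// per-row NULL flags plus one batch-level noNulls flag, like Hive's column
// vectors; the names here are illustrative, not the real Hive API.
public class VectorConcatSketch {

    static class Vec {
        String[] values;
        boolean[] isNull;   // per-row NULL flags
        boolean noNulls;    // batch-level flag: true => isNull[] can be ignored

        Vec(String[] values, boolean[] isNull, boolean noNulls) {
            this.values = values;
            this.isNull = isNull;
            this.noNulls = noNulls;
        }
    }

    // Concatenate in1 and in2 row by row into out.
    static void concat(Vec in1, Vec in2, Vec out) {
        for (int i = 0; i < in1.values.length; i++) {
            boolean null1 = !in1.noNulls && in1.isNull[i];
            boolean null2 = !in2.noNulls && in2.isNull[i];
            out.isNull[i] = null1 || null2;  // NULL if either side is NULL
            out.values[i] = out.isNull[i] ? null : in1.values[i] + in2.values[i];
        }
        // The point of the fix: recompute the batch-level flag on every call.
        // If it is never set back to true for an all-non-null batch, a stale
        // 'false' (plus stale isNull[] entries) left over from an earlier batch
        // makes readers that trust the batch flag report NULL for non-null rows.
        out.noNulls = in1.noNulls && in2.noNulls;
    }
}
```

A reused output vector that previously held NULLs is exactly the situation in which a missing `outV.noNulls = true` assignment shows up.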
[jira] [Updated] (HIVE-27512) CalciteSemanticException.UnsupportedFeature enum to capital
[ https://issues.apache.org/jira/browse/HIVE-27512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mahesh Raju Somalaraju updated HIVE-27512: -- Status: Patch Available (was: In Progress) > CalciteSemanticException.UnsupportedFeature enum to capital > --- > > Key: HIVE-27512 > URL: https://issues.apache.org/jira/browse/HIVE-27512 > Project: Hive > Issue Type: Improvement >Reporter: László Bodor >Assignee: Mahesh Raju Somalaraju >Priority: Major > Labels: newbie, pull-request-available > > https://github.com/apache/hive/blob/3bc62cbc2d42c22dfd55f78ad7b41ec84a71380f/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/CalciteSemanticException.java#L32-L39 > {code} > public enum UnsupportedFeature { > Distinct_without_an_aggreggation, Duplicates_in_RR, > Filter_expression_with_non_boolean_return_type, > Having_clause_without_any_groupby, Invalid_column_reference, > Invalid_decimal, > Less_than_equal_greater_than, Others, Same_name_in_multiple_expressions, > Schema_less_table, Select_alias_in_having_clause, Select_transform, > Subquery, > Table_sample_clauses, UDTF, Union_type, Unique_join, > HighPrecissionTimestamp // CALCITE-1690 > }; > {code} > this just hurts my eyes, I expect it as DISTINCT_WITHOUT_AN_AGGREGATION ... -- This message was sent by Atlassian Jira (v8.20.10#820010)
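For reference, the requested rename can be sketched as below. This is an abbreviated, hypothetical version: the real enum in CalciteSemanticException has many more constants, and any call sites referencing the old mixed-case names would need updating along with it.

```java
// Abbreviated sketch of the proposed rename: constants move to
// UPPER_SNAKE_CASE, which also fixes the "HighPrecissionTimestamp" spelling.
public class EnumRenameSketch {
    enum UnsupportedFeature {
        DISTINCT_WITHOUT_AN_AGGREGATION,
        DUPLICATES_IN_RR,
        INVALID_COLUMN_REFERENCE,
        HIGH_PRECISION_TIMESTAMP // CALCITE-1690
    }
}
```

Note that Enum.valueOf and name() are case-sensitive, so any persisted or string-matched uses of the old names must be migrated together with the rename.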
[jira] [Work started] (HIVE-27512) CalciteSemanticException.UnsupportedFeature enum to capital
[ https://issues.apache.org/jira/browse/HIVE-27512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-27512 started by Mahesh Raju Somalaraju. - > CalciteSemanticException.UnsupportedFeature enum to capital > --- > > Key: HIVE-27512 > URL: https://issues.apache.org/jira/browse/HIVE-27512 > Project: Hive > Issue Type: Improvement >Reporter: László Bodor >Assignee: Mahesh Raju Somalaraju >Priority: Major > Labels: newbie > > https://github.com/apache/hive/blob/3bc62cbc2d42c22dfd55f78ad7b41ec84a71380f/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/CalciteSemanticException.java#L32-L39 > {code} > public enum UnsupportedFeature { > Distinct_without_an_aggreggation, Duplicates_in_RR, > Filter_expression_with_non_boolean_return_type, > Having_clause_without_any_groupby, Invalid_column_reference, > Invalid_decimal, > Less_than_equal_greater_than, Others, Same_name_in_multiple_expressions, > Schema_less_table, Select_alias_in_having_clause, Select_transform, > Subquery, > Table_sample_clauses, UDTF, Union_type, Unique_join, > HighPrecissionTimestamp // CALCITE-1690 > }; > {code} > this just hurts my eyes, I expect it as DISTINCT_WITHOUT_AN_AGGREGATION ... -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (HIVE-27512) CalciteSemanticException.UnsupportedFeature enum to capital
[ https://issues.apache.org/jira/browse/HIVE-27512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17788299#comment-17788299 ] Mahesh Raju Somalaraju commented on HIVE-27512: --- [~abstractdog] I have assigned this Jira to myself and will raise a PR. > CalciteSemanticException.UnsupportedFeature enum to capital > --- > > Key: HIVE-27512 > URL: https://issues.apache.org/jira/browse/HIVE-27512 > Project: Hive > Issue Type: Improvement >Reporter: László Bodor >Assignee: Mahesh Raju Somalaraju >Priority: Major > Labels: newbie > > https://github.com/apache/hive/blob/3bc62cbc2d42c22dfd55f78ad7b41ec84a71380f/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/CalciteSemanticException.java#L32-L39 > {code} > public enum UnsupportedFeature { > Distinct_without_an_aggreggation, Duplicates_in_RR, > Filter_expression_with_non_boolean_return_type, > Having_clause_without_any_groupby, Invalid_column_reference, > Invalid_decimal, > Less_than_equal_greater_than, Others, Same_name_in_multiple_expressions, > Schema_less_table, Select_alias_in_having_clause, Select_transform, > Subquery, > Table_sample_clauses, UDTF, Union_type, Unique_join, > HighPrecissionTimestamp // CALCITE-1690 > }; > {code} > this just hurts my eyes, I expect it as DISTINCT_WITHOUT_AN_AGGREGATION ... -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (HIVE-27512) CalciteSemanticException.UnsupportedFeature enum to capital
[ https://issues.apache.org/jira/browse/HIVE-27512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mahesh Raju Somalaraju reassigned HIVE-27512: - Assignee: Mahesh Raju Somalaraju > CalciteSemanticException.UnsupportedFeature enum to capital > --- > > Key: HIVE-27512 > URL: https://issues.apache.org/jira/browse/HIVE-27512 > Project: Hive > Issue Type: Improvement >Reporter: László Bodor >Assignee: Mahesh Raju Somalaraju >Priority: Major > Labels: newbie > > https://github.com/apache/hive/blob/3bc62cbc2d42c22dfd55f78ad7b41ec84a71380f/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/CalciteSemanticException.java#L32-L39 > {code} > public enum UnsupportedFeature { > Distinct_without_an_aggreggation, Duplicates_in_RR, > Filter_expression_with_non_boolean_return_type, > Having_clause_without_any_groupby, Invalid_column_reference, > Invalid_decimal, > Less_than_equal_greater_than, Others, Same_name_in_multiple_expressions, > Schema_less_table, Select_alias_in_having_clause, Select_transform, > Subquery, > Table_sample_clauses, UDTF, Union_type, Unique_join, > HighPrecissionTimestamp // CALCITE-1690 > }; > {code} > this just hurts my eyes, I expect it as DISTINCT_WITHOUT_AN_AGGREGATION ... -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (HIVE-27703) Remove PowerMock from itests/hive-jmh and upgrade mockito to 4.11
[ https://issues.apache.org/jira/browse/HIVE-27703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mahesh Raju Somalaraju resolved HIVE-27703. --- Resolution: Duplicate Handled as part of: https://issues.apache.org/jira/browse/HIVE-27736 > Remove PowerMock from itests/hive-jmh and upgrade mockito to 4.11 > - > > Key: HIVE-27703 > URL: https://issues.apache.org/jira/browse/HIVE-27703 > Project: Hive > Issue Type: Task > Components: HiveServer2 >Reporter: Zsolt Miskolczi >Priority: Major > Labels: newbie, starter > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HIVE-27871) Fix some formatting problems in YarnQueueHelper
[ https://issues.apache.org/jira/browse/HIVE-27871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mahesh Raju Somalaraju updated HIVE-27871: -- Status: Patch Available (was: Open) > Fix some formatting problems in YarnQueueHelper > --- > > Key: HIVE-27871 > URL: https://issues.apache.org/jira/browse/HIVE-27871 > Project: Hive > Issue Type: Improvement >Reporter: László Bodor >Assignee: Mahesh Raju Somalaraju >Priority: Major > Labels: newbie, pull-request-available > > https://github.com/apache/hive/blob/cbc5d2d7d650f90882c5c4ad0026a94d2e586acb/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/YarnQueueHelper.java#L54-L57 > {code} > private static String webapp_conf_key = YarnConfiguration.RM_WEBAPP_ADDRESS; > private static String webapp_ssl_conf_key = > YarnConfiguration.RM_WEBAPP_HTTPS_ADDRESS; > private static String yarn_HA_enabled = YarnConfiguration.RM_HA_ENABLED; > private static String yarn_HA_rmids = YarnConfiguration.RM_HA_IDS; > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (HIVE-27871) Fix some formatting problems in YarnQueueHelper
[ https://issues.apache.org/jira/browse/HIVE-27871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mahesh Raju Somalaraju reassigned HIVE-27871: - Assignee: Mahesh Raju Somalaraju > Fix some formatting problems in YarnQueueHelper > --- > > Key: HIVE-27871 > URL: https://issues.apache.org/jira/browse/HIVE-27871 > Project: Hive > Issue Type: Improvement >Reporter: László Bodor >Assignee: Mahesh Raju Somalaraju >Priority: Major > Labels: newbie > > https://github.com/apache/hive/blob/cbc5d2d7d650f90882c5c4ad0026a94d2e586acb/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/YarnQueueHelper.java#L54-L57 > {code} > private static String webapp_conf_key = YarnConfiguration.RM_WEBAPP_ADDRESS; > private static String webapp_ssl_conf_key = > YarnConfiguration.RM_WEBAPP_HTTPS_ADDRESS; > private static String yarn_HA_enabled = YarnConfiguration.RM_HA_ENABLED; > private static String yarn_HA_rmids = YarnConfiguration.RM_HA_IDS; > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (HIVE-27679) Ranger Yarn Queue policies are not applying correctly, rework done for HIVE-26352
[ https://issues.apache.org/jira/browse/HIVE-27679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mahesh Raju Somalaraju reassigned HIVE-27679: - Assignee: Mahesh Raju Somalaraju > Ranger Yarn Queue policies are not applying correctly, rework done for > HIVE-26352 > - > > Key: HIVE-27679 > URL: https://issues.apache.org/jira/browse/HIVE-27679 > Project: Hive > Issue Type: Bug >Reporter: Mahesh Raju Somalaraju >Assignee: Mahesh Raju Somalaraju >Priority: Major > > This Jira is raised to modify/fix the code that was done as part of > *HIVE-26352*. > Versions which have {*}HIVE-26352{*}/HIVE-27029 are not able to enforce Yarn > Ranger queue policies, because the change made in {*}HIVE-26352{*}/HIVE-27029 > catches ALL exceptions, so exceptions that would normally be thrown are ignored > and the user is allowed to run a job in that queue. > Allowing the user to run jobs is not the expected behaviour. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HIVE-27679) Ranger Yarn Queue policies are not applying correctly, rework done for HIVE-26352
Mahesh Raju Somalaraju created HIVE-27679: - Summary: Ranger Yarn Queue policies are not applying correctly, rework done for HIVE-26352 Key: HIVE-27679 URL: https://issues.apache.org/jira/browse/HIVE-27679 Project: Hive Issue Type: Bug Reporter: Mahesh Raju Somalaraju This Jira is raised to modify/fix the code that was done as part of *HIVE-26352*. Versions which have {*}HIVE-26352{*}/HIVE-27029 are not able to enforce Yarn Ranger queue policies, because the change made in {*}HIVE-26352{*}/HIVE-27029 catches ALL exceptions, so exceptions that would normally be thrown are ignored and the user is allowed to run a job in that queue. Allowing the user to run jobs is not the expected behaviour. -- This message was sent by Atlassian Jira (v8.20.10#820010)
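The problem described above (a blanket catch swallowing authorization failures) can be illustrated with a small sketch. The class and method names below are illustrative, not the actual YarnQueueHelper API:

```java
import java.io.IOException;

// Sketch: catch only the failures that are safe to tolerate (e.g. a transient
// ResourceManager lookup error) and let authorization failures propagate.
public class QueueAccessSketch {

    static class AuthorizationException extends RuntimeException {
        AuthorizationException(String message) { super(message); }
    }

    // A broad catch (Exception e) around this whole method would also swallow
    // AuthorizationException, letting a denied user run a job in the queue:
    // the behaviour this ticket reworks.
    static void checkQueueAccess(boolean authorized, boolean rmUnreachable) {
        try {
            if (rmUnreachable) {
                throw new IOException("RM web app not reachable");
            }
        } catch (IOException e) {
            // tolerated: caller may retry or fall back to another RM address
            return;
        }
        if (!authorized) {
            throw new AuthorizationException("user is not authorized for this queue");
        }
    }
}
```

The design point is simply that the catch clause must name the recoverable exception types; authorization errors must reach the caller so the job submission is rejected.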
[jira] [Resolved] (HIVE-27303) select query result is different when enable/disable mapjoin with UNION ALL
[ https://issues.apache.org/jira/browse/HIVE-27303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mahesh Raju Somalaraju resolved HIVE-27303. --- Resolution: Fixed > select query result is different when enable/disable mapjoin with UNION ALL > --- > > Key: HIVE-27303 > URL: https://issues.apache.org/jira/browse/HIVE-27303 > Project: Hive > Issue Type: Bug >Reporter: Mahesh Raju Somalaraju >Assignee: Mahesh Raju Somalaraju >Priority: Major > Labels: pull-request-available > > select query result is different when enable/disable mapjoin with UNION ALL > Below are the reproduce steps. > As per query when map.join is disabled it should not give rows(duplicate). > Same is working fine with map.join=true. > Expected result: Empty rows. > Problem: returning duplicate rows. > Steps: > -- > SET hive.server2.tez.queue.access.check=true; > SET tez.queue.name=default > SET hive.query.results.cache.enabled=false; > SET hive.fetch.task.conversion=none; > SET hive.execution.engine=tez; > SET hive.stats.autogather=true; > SET hive.server2.enable.doAs=false; > SET hive.auto.convert.join=false; > drop table if exists hive1_tbl_data; > drop table if exists hive2_tbl_data; > drop table if exists hive3_tbl_data; > drop table if exists hive4_tbl_data; > CREATE EXTERNAL TABLE hive1_tbl_data (COLUMID string,COLUMN_FN > string,COLUMN_LN string,EMAIL string,COL_UPDATED_DATE timestamp, PK_COLUM > string) > ROW FORMAT SERDE > 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe' > STORED AS INPUTFORMAT > 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat' > OUTPUTFORMAT > 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat' > TBLPROPERTIES ( > 'TRANSLATED_TO_EXTERNAL'='true', > 'bucketing_version'='2', > 'external.table.purge'='true', > 'parquet.compression'='SNAPPY'); > CREATE EXTERNAL TABLE hive2_tbl_data (COLUMID string,COLUMN_FN > string,COLUMN_LN string,EMAIL string,COL_UPDATED_DATE timestamp, PK_COLUM > string) > ROW FORMAT SERDE > 
'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe' > STORED AS INPUTFORMAT > 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat' > OUTPUTFORMAT > 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat' > TBLPROPERTIES ( > 'TRANSLATED_TO_EXTERNAL'='true', > 'bucketing_version'='2', > 'external.table.purge'='true', > 'parquet.compression'='SNAPPY'); > CREATE EXTERNAL TABLE hive3_tbl_data (COLUMID string,COLUMN_FN > string,COLUMN_LN string,EMAIL string,COL_UPDATED_DATE timestamp, PK_COLUM > string) > ROW FORMAT SERDE > 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe' > STORED AS INPUTFORMAT > 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat' > OUTPUTFORMAT > 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat' > TBLPROPERTIES ( > 'TRANSLATED_TO_EXTERNAL'='true', > 'bucketing_version'='2', > 'external.table.purge'='true', > 'parquet.compression'='SNAPPY'); > CREATE EXTERNAL TABLE hive4_tbl_data (COLUMID string,COLUMN_FN > string,COLUMN_LN string,EMAIL string,COL_UPDATED_DATE timestamp, PK_COLUM > string) > ROW FORMAT SERDE > 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe' > STORED AS INPUTFORMAT > 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat' > OUTPUTFORMAT > 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat' > TBLPROPERTIES ( > 'TRANSLATED_TO_EXTERNAL'='true', > 'bucketing_version'='2', > 'external.table.purge'='true', > 'parquet.compression'='SNAPPY'); > > insert into table hive1_tbl_data select > '1','john','doe','j...@hotmail.com','2014-01-01 12:01:02','4000-1'; > insert into table hive1_tbl_data select > '2','john','doe','j...@hotmail.com','2014-01-01 > 12:01:02','4000-1';insert into table hive2_tbl_data select > '1','john','doe','j...@hotmail.com','201
[jira] [Commented] (HIVE-27303) select query result is different when enable/disable mapjoin with UNION ALL
[ https://issues.apache.org/jira/browse/HIVE-27303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17758346#comment-17758346 ] Mahesh Raju Somalaraju commented on HIVE-27303: --- Merged the PR. https://github.com/apache/hive/pull/4406 > select query result is different when enable/disable mapjoin with UNION ALL > --- > > Key: HIVE-27303 > URL: https://issues.apache.org/jira/browse/HIVE-27303 > Project: Hive > Issue Type: Bug >Reporter: Mahesh Raju Somalaraju >Assignee: Mahesh Raju Somalaraju >Priority: Major > Labels: pull-request-available > > select query result is different when enable/disable mapjoin with UNION ALL > Below are the reproduce steps. > As per query when map.join is disabled it should not give rows(duplicate). > Same is working fine with map.join=true. > Expected result: Empty rows. > Problem: returning duplicate rows. > Steps: > -- > SET hive.server2.tez.queue.access.check=true; > SET tez.queue.name=default > SET hive.query.results.cache.enabled=false; > SET hive.fetch.task.conversion=none; > SET hive.execution.engine=tez; > SET hive.stats.autogather=true; > SET hive.server2.enable.doAs=false; > SET hive.auto.convert.join=false; > drop table if exists hive1_tbl_data; > drop table if exists hive2_tbl_data; > drop table if exists hive3_tbl_data; > drop table if exists hive4_tbl_data; > CREATE EXTERNAL TABLE hive1_tbl_data (COLUMID string,COLUMN_FN > string,COLUMN_LN string,EMAIL string,COL_UPDATED_DATE timestamp, PK_COLUM > string) > ROW FORMAT SERDE > 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe' > STORED AS INPUTFORMAT > 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat' > OUTPUTFORMAT > 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat' > TBLPROPERTIES ( > 'TRANSLATED_TO_EXTERNAL'='true', > 'bucketing_version'='2', > 'external.table.purge'='true', > 'parquet.compression'='SNAPPY'); > CREATE EXTERNAL TABLE hive2_tbl_data (COLUMID string,COLUMN_FN > string,COLUMN_LN 
string,EMAIL string,COL_UPDATED_DATE timestamp, PK_COLUM > string) > ROW FORMAT SERDE > 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe' > STORED AS INPUTFORMAT > 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat' > OUTPUTFORMAT > 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat' > TBLPROPERTIES ( > 'TRANSLATED_TO_EXTERNAL'='true', > 'bucketing_version'='2', > 'external.table.purge'='true', > 'parquet.compression'='SNAPPY'); > CREATE EXTERNAL TABLE hive3_tbl_data (COLUMID string,COLUMN_FN > string,COLUMN_LN string,EMAIL string,COL_UPDATED_DATE timestamp, PK_COLUM > string) > ROW FORMAT SERDE > 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe' > STORED AS INPUTFORMAT > 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat' > OUTPUTFORMAT > 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat' > TBLPROPERTIES ( > 'TRANSLATED_TO_EXTERNAL'='true', > 'bucketing_version'='2', > 'external.table.purge'='true', > 'parquet.compression'='SNAPPY'); > CREATE EXTERNAL TABLE hive4_tbl_data (COLUMID string,COLUMN_FN > string,COLUMN_LN string,EMAIL string,COL_UPDATED_DATE timestamp, PK_COLUM > string) > ROW FORMAT SERDE > 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe' > STORED AS INPUTFORMAT > 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat' > OUTPUTFORMAT > 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat' > TBLPROPERTIES ( > 'TRANSLATED_TO_EXTERNAL'='true', > 'bucketing_version'='2', > 'external.table.purge'='true', > 'parquet.compression'='SNAPPY'); > > insert into table hive1_tbl_data select > '1','john','doe','j...@hotmail.com','2014-01-01 12:01:02','4000-1'; > insert into table hive1_tbl_data select > '2','john','doe','j...@hotmail.com','2014-01-01 > 12:01:02','4000-100
[jira] [Commented] (HIVE-27303) select query result is different when enable/disable mapjoin with UNION ALL
[ https://issues.apache.org/jira/browse/HIVE-27303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17749470#comment-17749470 ] Mahesh Raju Somalaraju commented on HIVE-27303: --- [~seonggon] Thanks for your PR, let me verify it. If fix is working fine then you can merge the PR. > select query result is different when enable/disable mapjoin with UNION ALL > --- > > Key: HIVE-27303 > URL: https://issues.apache.org/jira/browse/HIVE-27303 > Project: Hive > Issue Type: Bug >Reporter: Mahesh Raju Somalaraju >Assignee: Mahesh Raju Somalaraju >Priority: Major > Labels: pull-request-available > > select query result is different when enable/disable mapjoin with UNION ALL > Below are the reproduce steps. > As per query when map.join is disabled it should not give rows(duplicate). > Same is working fine with map.join=true. > Expected result: Empty rows. > Problem: returning duplicate rows. > Steps: > -- > SET hive.server2.tez.queue.access.check=true; > SET tez.queue.name=default > SET hive.query.results.cache.enabled=false; > SET hive.fetch.task.conversion=none; > SET hive.execution.engine=tez; > SET hive.stats.autogather=true; > SET hive.server2.enable.doAs=false; > SET hive.auto.convert.join=false; > drop table if exists hive1_tbl_data; > drop table if exists hive2_tbl_data; > drop table if exists hive3_tbl_data; > drop table if exists hive4_tbl_data; > CREATE EXTERNAL TABLE hive1_tbl_data (COLUMID string,COLUMN_FN > string,COLUMN_LN string,EMAIL string,COL_UPDATED_DATE timestamp, PK_COLUM > string) > ROW FORMAT SERDE > 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe' > STORED AS INPUTFORMAT > 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat' > OUTPUTFORMAT > 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat' > TBLPROPERTIES ( > 'TRANSLATED_TO_EXTERNAL'='true', > 'bucketing_version'='2', > 'external.table.purge'='true', > 'parquet.compression'='SNAPPY'); > CREATE EXTERNAL TABLE hive2_tbl_data 
(COLUMID string,COLUMN_FN > string,COLUMN_LN string,EMAIL string,COL_UPDATED_DATE timestamp, PK_COLUM > string) > ROW FORMAT SERDE > 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe' > STORED AS INPUTFORMAT > 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat' > OUTPUTFORMAT > 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat' > TBLPROPERTIES ( > 'TRANSLATED_TO_EXTERNAL'='true', > 'bucketing_version'='2', > 'external.table.purge'='true', > 'parquet.compression'='SNAPPY'); > CREATE EXTERNAL TABLE hive3_tbl_data (COLUMID string,COLUMN_FN > string,COLUMN_LN string,EMAIL string,COL_UPDATED_DATE timestamp, PK_COLUM > string) > ROW FORMAT SERDE > 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe' > STORED AS INPUTFORMAT > 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat' > OUTPUTFORMAT > 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat' > TBLPROPERTIES ( > 'TRANSLATED_TO_EXTERNAL'='true', > 'bucketing_version'='2', > 'external.table.purge'='true', > 'parquet.compression'='SNAPPY'); > CREATE EXTERNAL TABLE hive4_tbl_data (COLUMID string,COLUMN_FN > string,COLUMN_LN string,EMAIL string,COL_UPDATED_DATE timestamp, PK_COLUM > string) > ROW FORMAT SERDE > 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe' > STORED AS INPUTFORMAT > 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat' > OUTPUTFORMAT > 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat' > TBLPROPERTIES ( > 'TRANSLATED_TO_EXTERNAL'='true', > 'bucketing_version'='2', > 'external.table.purge'='true', > 'parquet.compression'='SNAPPY'); > > insert into table hive1_tbl_data select > '1','john','doe','j...@hotmail.com','2014-01-01 12:01:02','4000-1'; > insert into table hive1_tbl_data select > '2','john','doe','j...@hot
[jira] [Updated] (HIVE-27303) select query result is different when enable/disable mapjoin with UNION ALL
[ https://issues.apache.org/jira/browse/HIVE-27303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mahesh Raju Somalaraju updated HIVE-27303: -- Description: select query result is different when enable/disable mapjoin with UNION ALL. Below are the steps to reproduce. As per the query, when mapjoin is disabled it should not return (duplicate) rows; the same works fine with hive.auto.convert.join=true. Expected result: empty rows. Problem: duplicate rows are returned. Steps: --
SET hive.server2.tez.queue.access.check=true;
SET tez.queue.name=default;
SET hive.query.results.cache.enabled=false;
SET hive.fetch.task.conversion=none;
SET hive.execution.engine=tez;
SET hive.stats.autogather=true;
SET hive.server2.enable.doAs=false;
SET hive.auto.convert.join=false;
drop table if exists hive1_tbl_data;
drop table if exists hive2_tbl_data;
drop table if exists hive3_tbl_data;
drop table if exists hive4_tbl_data;
CREATE EXTERNAL TABLE hive1_tbl_data (COLUMID string, COLUMN_FN string, COLUMN_LN string, EMAIL string, COL_UPDATED_DATE timestamp, PK_COLUM string) ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe' STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat' TBLPROPERTIES ('TRANSLATED_TO_EXTERNAL'='true', 'bucketing_version'='2', 'external.table.purge'='true', 'parquet.compression'='SNAPPY');
CREATE EXTERNAL TABLE hive2_tbl_data (COLUMID string, COLUMN_FN string, COLUMN_LN string, EMAIL string, COL_UPDATED_DATE timestamp, PK_COLUM string) ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe' STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat' TBLPROPERTIES ('TRANSLATED_TO_EXTERNAL'='true', 'bucketing_version'='2', 'external.table.purge'='true', 'parquet.compression'='SNAPPY');
CREATE EXTERNAL TABLE hive3_tbl_data (COLUMID string, COLUMN_FN string, COLUMN_LN string, EMAIL string, COL_UPDATED_DATE timestamp, PK_COLUM string) ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe' STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat' TBLPROPERTIES ('TRANSLATED_TO_EXTERNAL'='true', 'bucketing_version'='2', 'external.table.purge'='true', 'parquet.compression'='SNAPPY');
CREATE EXTERNAL TABLE hive4_tbl_data (COLUMID string, COLUMN_FN string, COLUMN_LN string, EMAIL string, COL_UPDATED_DATE timestamp, PK_COLUM string) ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe' STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat' TBLPROPERTIES ('TRANSLATED_TO_EXTERNAL'='true', 'bucketing_version'='2', 'external.table.purge'='true', 'parquet.compression'='SNAPPY');
insert into table hive1_tbl_data select '1','john','doe','j...@hotmail.com','2014-01-01 12:01:02','4000-1';
insert into table hive1_tbl_data select '2','john','doe','j...@hotmail.com','2014-01-01 12:01:02','4000-1';
insert into table hive2_tbl_data select '1','john','doe','j...@hotmail.com','2014-01-01 12:01:02','1';
insert into table hive2_tbl_data select '2','john','doe','j...@hotmail.com','2014-01-01 12:01:02','1';
select t.COLUMID from ( select distinct t.COLUMID as COLUMID from (SELECT COLUMID FROM hive3_tbl_data UNION ALL SELECT COLUMID FROM hive1_tbl_data) t ) t left join ( select distinct t.COLUMID from (SELECT COLUMID FROM hive4_tbl_data UNION ALL SELECT COLUMID FROM hive2_tbl_data) t ) t1 on t.COLUMID = t1.COLUMID where t1.COLUMID is null;
was: select query result is different when enable/disable mapjoin with UNION ALL Below are the
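The expected result of the final query in the description can be modeled in plain Java (a model of the join semantics, not Hive code): each side deduplicates its UNION ALL input, and the left join plus IS NULL filter is a left anti-join, which must come out empty when the distinct id sets match.

```java
import java.util.*;

// Plain-Java model of: select distinct ids on each side, then keep the left
// ids that have no match on the right (left join + "t1.COLUMID is null").
public class AntiJoinSketch {
    static List<String> leftAntiJoin(Collection<String> left, Collection<String> right) {
        Set<String> distinctLeft = new LinkedHashSet<>(left); // distinct over UNION ALL
        Set<String> rightIds = new HashSet<>(right);
        List<String> result = new ArrayList<>();
        for (String id : distinctLeft) {
            if (!rightIds.contains(id)) {
                result.add(id); // survives the IS NULL filter
            }
        }
        return result;
    }
}
```

With ids {'1','2'} inserted on both sides, as in the repro, the correct answer is no rows; any (duplicate) output rows mean the mapjoin-disabled plan broke the distinct/join semantics.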
[jira] [Created] (HIVE-27303) select query result is different when enable/disable mapjoin with UNION ALL
Mahesh Raju Somalaraju created HIVE-27303: - Summary: select query result is different when enable/disable mapjoin with UNION ALL Key: HIVE-27303 URL: https://issues.apache.org/jira/browse/HIVE-27303 Project: Hive Issue Type: Bug Reporter: Mahesh Raju Somalaraju Assignee: Mahesh Raju Somalaraju select query result is different when enable/disable mapjoin with UNION ALL. Below are the steps to reproduce. As per the query, when mapjoin is disabled it should not return (duplicate) rows; the same works fine with hive.auto.convert.join=true. Expected result: empty rows. Problem: duplicate rows are returned. Steps: --
SET hive.server2.tez.queue.access.check=true;
SET tez.queue.name=default;
SET hive.query.results.cache.enabled=false;
SET hive.fetch.task.conversion=none;
SET hive.execution.engine=tez;
SET hive.stats.autogather=true;
SET hive.server2.enable.doAs=false;
SET hive.auto.convert.join=true;
drop table if exists hive1_tbl_data;
drop table if exists hive2_tbl_data;
drop table if exists hive3_tbl_data;
drop table if exists hive4_tbl_data;
CREATE EXTERNAL TABLE hive1_tbl_data (COLUMID string, COLUMN_FN string, COLUMN_LN string, EMAIL string, COL_UPDATED_DATE timestamp, PK_COLUM string) ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe' STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat' TBLPROPERTIES ('TRANSLATED_TO_EXTERNAL'='true', 'bucketing_version'='2', 'external.table.purge'='true', 'parquet.compression'='SNAPPY');
CREATE EXTERNAL TABLE hive2_tbl_data (COLUMID string, COLUMN_FN string, COLUMN_LN string, EMAIL string, COL_UPDATED_DATE timestamp, PK_COLUM string) ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe' STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat' TBLPROPERTIES ('TRANSLATED_TO_EXTERNAL'='true', 'bucketing_version'='2', 'external.table.purge'='true', 'parquet.compression'='SNAPPY');
CREATE EXTERNAL TABLE hive3_tbl_data (COLUMID string, COLUMN_FN string, COLUMN_LN string, EMAIL string, COL_UPDATED_DATE timestamp, PK_COLUM string) ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe' STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat' TBLPROPERTIES ('TRANSLATED_TO_EXTERNAL'='true', 'bucketing_version'='2', 'external.table.purge'='true', 'parquet.compression'='SNAPPY');
CREATE EXTERNAL TABLE hive4_tbl_data (COLUMID string, COLUMN_FN string, COLUMN_LN string, EMAIL string, COL_UPDATED_DATE timestamp, PK_COLUM string) ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe' STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat' TBLPROPERTIES ('TRANSLATED_TO_EXTERNAL'='true', 'bucketing_version'='2', 'external.table.purge'='true', 'parquet.compression'='SNAPPY');
insert into table hive1_tbl_data select '1','john','doe','j...@hotmail.com','2014-01-01 12:01:02','4000-1';
insert into table hive1_tbl_data select '2','john','doe','j...@hotmail.com','2014-01-01 12:01:02','4000-1';
insert into table hive2_tbl_data select '1','john','doe','j...@hotmail.com','2014-01-01 12:01:02','1';
insert into table hive2_tbl_data select '2','john','doe','j...@hotmail.com','2014-01-01 12:01:02','1';
select t.COLUMID from ( select distinct t.COLUMID as COLUMID from (SELECT COLUMID FROM hive3_tbl_data UNION ALL SELECT COLUMID FROM hive1_tbl_data) t ) t left join ( select distinct t.COLUMID from (SELECT COLUMID FROM hive4_tbl_data UNI
[jira] [Resolved] (HIVE-27196) Upgrade jettison version to 1.5.4 due to CVEs
[ https://issues.apache.org/jira/browse/HIVE-27196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mahesh Raju Somalaraju resolved HIVE-27196.
---
Resolution: Duplicate

This was fixed as part of HIVE-27286; hence closing this Jira.

> Upgrade jettison version to 1.5.4 due to CVEs
> --
>
> Key: HIVE-27196
> URL: https://issues.apache.org/jira/browse/HIVE-27196
> Project: Hive
> Issue Type: Improvement
> Reporter: Mahesh Raju Somalaraju
> Assignee: Mahesh Raju Somalaraju
> Priority: Major
>
> [CVE-2023-1436|https://www.cve.org/CVERecord?id=CVE-2023-1436]
> [CWE-400|https://cwe.mitre.org/data/definitions/400.html]
> Need to update jettison to version 1.5.4 due to the above CVE issues.
> Version 1.5.4 has no CVE issues.

-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (HIVE-27196) Upgrade jettison version to 1.5.4 due to CVEs
[ https://issues.apache.org/jira/browse/HIVE-27196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17716084#comment-17716084 ]

Mahesh Raju Somalaraju commented on HIVE-27196:
---
This was fixed as part of HIVE-27286; hence closing this Jira.

> Upgrade jettison version to 1.5.4 due to CVEs
> --
>
> Key: HIVE-27196
> URL: https://issues.apache.org/jira/browse/HIVE-27196
> Project: Hive
> Issue Type: Improvement
> Reporter: Mahesh Raju Somalaraju
> Assignee: Mahesh Raju Somalaraju
> Priority: Major
>
> [CVE-2023-1436|https://www.cve.org/CVERecord?id=CVE-2023-1436]
> [CWE-400|https://cwe.mitre.org/data/definitions/400.html]
> Need to update jettison to version 1.5.4 due to the above CVE issues.
> Version 1.5.4 has no CVE issues.
[jira] [Assigned] (HIVE-27198) Delete directly aborted transactions instead of select and loading ids
[ https://issues.apache.org/jira/browse/HIVE-27198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mahesh Raju Somalaraju reassigned HIVE-27198:
-

> Delete directly aborted transactions instead of select and loading ids
> --
>
> Key: HIVE-27198
> URL: https://issues.apache.org/jira/browse/HIVE-27198
> Project: Hive
> Issue Type: Improvement
> Reporter: Mahesh Raju Somalaraju
> Assignee: Mahesh Raju Somalaraju
> Priority: Major
>
> When cleaning aborted transactions, we can delete the txns directly instead of selecting them and processing the ids.
> Method name: cleanEmptyAbortedAndCommittedTxns
> Current code:
> String s = "SELECT \"TXN_ID\" FROM \"TXNS\" WHERE " +
>     "\"TXN_ID\" NOT IN (SELECT \"TC_TXNID\" FROM \"TXN_COMPONENTS\") AND " +
>     "(\"TXN_STATE\" = " + TxnStatus.ABORTED + " OR \"TXN_STATE\" = " + TxnStatus.COMMITTED + ") AND " +
>     "\"TXN_ID\" < " + lowWaterMark;
> Proposed code (note: SQL DELETE takes no column list, so the statement is DELETE FROM):
> String s = "DELETE FROM \"TXNS\" WHERE " +
>     "\"TXN_ID\" NOT IN (SELECT \"TC_TXNID\" FROM \"TXN_COMPONENTS\") AND " +
>     "(\"TXN_STATE\" = " + TxnStatus.ABORTED + " OR \"TXN_STATE\" = " + TxnStatus.COMMITTED + ") AND " +
>     "\"TXN_ID\" < " + lowWaterMark;
> The SELECT should be eliminated; the DELETE can use the same WHERE clause instead of a built IN clause. There is no reason to load the ids into memory and then generate a huge SQL statement.
> Batching is also not necessary here; we can delete the records directly.
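As a sanity check on the proposed change, here is a small sketch of building that single cleanup statement. The `'a'`/`'c'` state values are illustrative placeholders for the real `TxnStatus` constants, and the key correction versus the ticket's snippet is that SQL `DELETE` takes no column list:

```java
public class AbortedTxnCleanupSketch {
    // Builds the single-statement cleanup the ticket proposes: delete aborted
    // and committed txns below the low-water mark that have no components,
    // without first selecting their ids into memory.
    static String buildDelete(String aborted, String committed, long lowWaterMark) {
        return "DELETE FROM \"TXNS\" WHERE "
             + "\"TXN_ID\" NOT IN (SELECT \"TC_TXNID\" FROM \"TXN_COMPONENTS\") AND "
             + "(\"TXN_STATE\" = " + aborted + " OR \"TXN_STATE\" = " + committed + ") AND "
             + "\"TXN_ID\" < " + lowWaterMark;
    }

    public static void main(String[] args) {
        // 'a' and 'c' are stand-ins for TxnStatus.ABORTED / TxnStatus.COMMITTED.
        System.out.println(buildDelete("'a'", "'c'", 100L));
    }
}
```

This keeps the whole predicate on the database side, so no id list is loaded and no huge generated IN clause is needed.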
[jira] [Assigned] (HIVE-27196) Upgrade jettison version to 1.5.4 due to CVEs
[ https://issues.apache.org/jira/browse/HIVE-27196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mahesh Raju Somalaraju reassigned HIVE-27196:
-

> Upgrade jettison version to 1.5.4 due to CVEs
> --
>
> Key: HIVE-27196
> URL: https://issues.apache.org/jira/browse/HIVE-27196
> Project: Hive
> Issue Type: Improvement
> Reporter: Mahesh Raju Somalaraju
> Assignee: Mahesh Raju Somalaraju
> Priority: Major
>
> [CVE-2023-1436|https://www.cve.org/CVERecord?id=CVE-2023-1436]
> [CWE-400|https://cwe.mitre.org/data/definitions/400.html]
> Need to update jettison to version 1.5.4 due to the above CVE issues.
> Version 1.5.4 has no CVE issues.
[jira] [Updated] (HIVE-27029) hive query fails with Filesystem closed error
[ https://issues.apache.org/jira/browse/HIVE-27029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mahesh Raju Somalaraju updated HIVE-27029:
--
Description:
This Jira is raised to modify/fix the code introduced as part of *HIVE-26352*. We should remove the finally block, as it is causing the "Filesystem closed" errors.

String queueName, String userName) throws IOException, InterruptedException {
  UserGroupInformation ugi = UserGroupInformation.getCurrentUser();
  try {
    ugi.doAs((PrivilegedExceptionAction) () -> {
      checkQueueAccessInternal(queueName, userName);
      return null;
    });
  } finally {  // <-- block to be removed
    try {
      FileSystem.closeAllForUGI(ugi);
    } catch (IOException exception) {
      LOG.error("Could not clean up file-system handles for UGI: " + ugi, exception);
    }
  }
}

Caused by: java.io.IOException: Filesystem closed
at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:483) ~[hadoop-hdfs-client-3.1.1.7.1.8.11-3.jar:?]
at org.apache.hadoop.hdfs.DFSClient.getEZForPath(DFSClient.java:2771) ~[hadoop-hdfs-client-3.1.1.7.1.8.11-3.jar:?]
at org.apache.hadoop.hdfs.DistributedFileSystem$54.doCall(DistributedFileSystem.java:2796) ~[hadoop-hdfs-client-3.1.1.7.1.8.11-3.jar:?]
at org.apache.hadoop.hdfs.DistributedFileSystem$54.doCall(DistributedFileSystem.java:2793) ~[hadoop-hdfs-client-3.1.1.7.1.8.11-3.jar:?]
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) ~[hadoop-common-3.1.1.7.1.8.11-3.jar:?]
at org.apache.hadoop.hdfs.DistributedFileSystem.getEZForPath(DistributedFileSystem.java:2812) ~[hadoop-hdfs-client-3.1.1.7.1.8.11-3.jar:?]
at org.apache.hadoop.hdfs.client.HdfsAdmin.getEncryptionZoneForPath(HdfsAdmin.java:374) ~[hadoop-hdfs-client-3.1.1.7.1.8.11-3.jar:?]
at org.apache.hadoop.hive.shims.Hadoop23Shims$HdfsEncryptionShim.getEncryptionZoneForPath(Hadoop23Shims.java:1384) ~[hive-exec-3.1.3000.7.1.8.11-3.jar:3.1.3000.7.1.8.11-3]
at org.apache.hadoop.hive.shims.Hadoop23Shims$HdfsEncryptionShim.isPathEncrypted(Hadoop23Shims.java:1379) ~[hive-exec-3.1.3000.7.1.8.11-3.jar:3.1.3000.7.1.8.11-3]
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.isPathEncrypted(SemanticAnalyzer.java:2484) ~[hive-exec-3.1.3000.7.1.8.11-3.jar:3.1.3000.7.1.8.11-3]

was:
Caused by: java.io.IOException: Filesystem closed
at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:483) ~[hadoop-hdfs-client-3.1.1.7.1.8.11-3.jar:?]
at org.apache.hadoop.hdfs.DFSClient.getEZForPath(DFSClient.java:2771) ~[hadoop-hdfs-client-3.1.1.7.1.8.11-3.jar:?]
at org.apache.hadoop.hdfs.DistributedFileSystem$54.doCall(DistributedFileSystem.java:2796) ~[hadoop-hdfs-client-3.1.1.7.1.8.11-3.jar:?]
at org.apache.hadoop.hdfs.DistributedFileSystem$54.doCall(DistributedFileSystem.java:2793) ~[hadoop-hdfs-client-3.1.1.7.1.8.11-3.jar:?]
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) ~[hadoop-common-3.1.1.7.1.8.11-3.jar:?]
at org.apache.hadoop.hdfs.DistributedFileSystem.getEZForPath(DistributedFileSystem.java:2812) ~[hadoop-hdfs-client-3.1.1.7.1.8.11-3.jar:?]
at org.apache.hadoop.hdfs.client.HdfsAdmin.getEncryptionZoneForPath(HdfsAdmin.java:374) ~[hadoop-hdfs-client-3.1.1.7.1.8.11-3.jar:?]
at org.apache.hadoop.hive.shims.Hadoop23Shims$HdfsEncryptionShim.getEncryptionZoneForPath(Hadoop23Shims.java:1384) ~[hive-exec-3.1.3000.7.1.8.11-3.jar:3.1.3000.7.1.8.11-3]
at org.apache.hadoop.hive.shims.Hadoop23Shims$HdfsEncryptionShim.isPathEncrypted(Hadoop23Shims.java:1379) ~[hive-exec-3.1.3000.7.1.8.11-3.jar:3.1.3000.7.1.8.11-3]
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.isPathEncrypted(SemanticAnalyzer.java:2484) ~[hive-exec-3.1.3000.7.1.8.11-3.jar:3.1.3000.7.1.8.11-3]

> hive query fails with Filesystem closed error
> -
>
> Key: HIVE-27029
> URL: https://issues.apache.org/jira/browse/HIVE-27029
> Project: Hive
> Issue Type: Bug
> Reporter: Mahesh Raju Somalaraju
> Assignee: Mahesh Raju Somalaraju
> Priority: Major
>
> This Jira is raised to modify/fix the code introduced as part of *HIVE-26352*.
> We should remove the finally block, as it is causing the "Filesystem closed" errors.
>
> String queueName, String userName) throws IOException, InterruptedException {
>   UserGroupInformation ugi = UserGroupInformation.getCurrentUser();
>   try {
>     ugi.doAs((PrivilegedExceptionAction) () -> {
>       checkQueueAccessInternal(queueName, userName);
>       return null;
>     });
>   } finally {
>     try {
>       FileSystem.closeAllForUGI(ugi);
>     } catch (IOException exception) {
>       LOG.error("Could not clean up file-system handles for UGI: " + ugi, exception);
>     }
>   }
> }
>
> Caused by: java.io.IOException: Filesystem closed
> at org
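The failure mode described above can be illustrated without Hadoop. The sketch below is a hypothetical stand-in for the per-UGI FileSystem cache, not Hive or Hadoop code: because handles are shared through a cache, a caller that closes "all handles for this UGI" on the way out also closes the handle a later step (here, standing in for the SemanticAnalyzer's encryption check) is still relying on.

```java
import java.util.HashMap;
import java.util.Map;

public class SharedHandleSketch {
    // Stand-in for a cached, shared filesystem handle.
    static class Handle {
        private boolean closed = false;
        void close() { closed = true; }
        String read() {
            if (closed) throw new IllegalStateException("Filesystem closed");
            return "ok";
        }
    }

    // Stand-in for the per-UGI filesystem cache: same key, same handle.
    static final Map<String, Handle> CACHE = new HashMap<>();

    static Handle get(String ugi) {
        return CACHE.computeIfAbsent(ugi, k -> new Handle());
    }

    // Mirrors the shape of the queue-access check: with closeInFinally=true
    // (analogous to FileSystem.closeAllForUGI in the finally block), the
    // shared handle is closed before anyone else is done with it.
    static void checkQueueAccess(String ugi, boolean closeInFinally) {
        try {
            get(ugi).read(); // the access check uses the filesystem here
        } finally {
            if (closeInFinally) {
                get(ugi).close();
            }
        }
    }

    public static void main(String[] args) {
        checkQueueAccess("hive", true);
        try {
            get("hive").read(); // a later compile step reuses the cached handle
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage()); // prints: Filesystem closed
        }
    }
}
```

Removing the close-in-finally (the proposed fix) leaves the cached handle usable for subsequent callers, which is why dropping the block resolves the error.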
[jira] [Updated] (HIVE-27029) hive query fails with Filesystem closed error
[ https://issues.apache.org/jira/browse/HIVE-27029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mahesh Raju Somalaraju updated HIVE-27029:
--
Description:
Caused by: java.io.IOException: Filesystem closed
at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:483) ~[hadoop-hdfs-client-3.1.1.7.1.8.11-3.jar:?]
at org.apache.hadoop.hdfs.DFSClient.getEZForPath(DFSClient.java:2771) ~[hadoop-hdfs-client-3.1.1.7.1.8.11-3.jar:?]
at org.apache.hadoop.hdfs.DistributedFileSystem$54.doCall(DistributedFileSystem.java:2796) ~[hadoop-hdfs-client-3.1.1.7.1.8.11-3.jar:?]
at org.apache.hadoop.hdfs.DistributedFileSystem$54.doCall(DistributedFileSystem.java:2793) ~[hadoop-hdfs-client-3.1.1.7.1.8.11-3.jar:?]
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) ~[hadoop-common-3.1.1.7.1.8.11-3.jar:?]
at org.apache.hadoop.hdfs.DistributedFileSystem.getEZForPath(DistributedFileSystem.java:2812) ~[hadoop-hdfs-client-3.1.1.7.1.8.11-3.jar:?]
at org.apache.hadoop.hdfs.client.HdfsAdmin.getEncryptionZoneForPath(HdfsAdmin.java:374) ~[hadoop-hdfs-client-3.1.1.7.1.8.11-3.jar:?]
at org.apache.hadoop.hive.shims.Hadoop23Shims$HdfsEncryptionShim.getEncryptionZoneForPath(Hadoop23Shims.java:1384) ~[hive-exec-3.1.3000.7.1.8.11-3.jar:3.1.3000.7.1.8.11-3]
at org.apache.hadoop.hive.shims.Hadoop23Shims$HdfsEncryptionShim.isPathEncrypted(Hadoop23Shims.java:1379) ~[hive-exec-3.1.3000.7.1.8.11-3.jar:3.1.3000.7.1.8.11-3]
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.isPathEncrypted(SemanticAnalyzer.java:2484) ~[hive-exec-3.1.3000.7.1.8.11-3.jar:3.1.3000.7.1.8.11-3]

was:
Caused by: java.io.IOException: Filesystem closed
at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:483) ~[hadoop-hdfs-client-3.1.1.7.1.8.11-3.jar:?]
at org.apache.hadoop.hdfs.DFSClient.getEZForPath(DFSClient.java:2771) ~[hadoop-hdfs-client-3.1.1.7.1.8.11-3.jar:?]
at org.apache.hadoop.hdfs.DistributedFileSystem$54.doCall(DistributedFileSystem.java:2796) ~[hadoop-hdfs-client-3.1.1.7.1.8.11-3.jar:?]
at org.apache.hadoop.hdfs.DistributedFileSystem$54.doCall(DistributedFileSystem.java:2793) ~[hadoop-hdfs-client-3.1.1.7.1.8.11-3.jar:?]
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) ~[hadoop-common-3.1.1.7.1.8.11-3.jar:?]
at org.apache.hadoop.hdfs.DistributedFileSystem.getEZForPath(DistributedFileSystem.java:2812) ~[hadoop-hdfs-client-3.1.1.7.1.8.11-3.jar:?]
at org.apache.hadoop.hdfs.client.HdfsAdmin.getEncryptionZoneForPath(HdfsAdmin.java:374) ~[hadoop-hdfs-client-3.1.1.7.1.8.11-3.jar:?]
at org.apache.hadoop.hive.shims.Hadoop23Shims$HdfsEncryptionShim.getEncryptionZoneForPath(Hadoop23Shims.java:1384) ~[hive-exec-3.1.3000.7.1.8.11-3.jar:3.1.3000.7.1.8.11-3]
at org.apache.hadoop.hive.shims.Hadoop23Shims$HdfsEncryptionShim.isPathEncrypted(Hadoop23Shims.java:1379) ~[hive-exec-3.1.3000.7.1.8.11-3.jar:3.1.3000.7.1.8.11-3]
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.isPathEncrypted(SemanticAnalyzer.java:2484) ~[hive-exec-3.1.3000.7.1.8.11-3.jar:3.1.3000.7.1.8.11-3]

steps:
1. When using an explicit queue name, the tez.queue.access check is used. If statistics gathering is enabled, the second insert fails at the compute_stats() phase.
beeline --hiveconf tez.queue.name=default -e "
SET hive.query.results.cache.enabled=false;
SET hive.fetch.task.conversion=none;
SET hive.stats.autogather=true;
drop table if exists default.bigd35368p100;
create table default.bigd35368p100 (name string) partitioned by (id int);
insert into default.bigd35368p100 select * from default.bigd35368e100;
drop table if exists default.bigd35368p100;
create table default.bigd35368p100 (name string) partitioned by (id int);
insert into default.bigd35368p100 select * from default.bigd35368e100;
"

> hive query fails with Filesystem closed error
> -
>
> Key: HIVE-27029
> URL: https://issues.apache.org/jira/browse/HIVE-27029
> Project: Hive
> Issue Type: Bug
> Reporter: Mahesh Raju Somalaraju
> Assignee: Mahesh Raju Somalaraju
> Priority: Major
>
> Caused by: java.io.IOException: Filesystem closed
> at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:483) ~[hadoop-hdfs-client-3.1.1.7.1.8.11-3.jar:?]
> at org.apache.hadoop.hdfs.DFSClient.getEZForPath(DFSClient.java:2771) ~[hadoop-hdfs-client-3.1.1.7.1.8.11-3.jar:?]
> at org.apache.hadoop.hdfs.DistributedFileSystem$54.doCall(DistributedFileSystem.java:2796) ~[hadoop-hdfs-client-3.1.1.7.1.8.11-3.jar:?]
> at org.apache.hadoop.hdfs.DistributedFileSystem$54.doCall(DistributedFileSystem.java:2793) ~[hadoop-hdfs-client-3.1.1.7.1.8.11-3.jar:?]
> at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
[jira] [Assigned] (HIVE-27029) hive query fails with Filesystem closed error
[ https://issues.apache.org/jira/browse/HIVE-27029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mahesh Raju Somalaraju reassigned HIVE-27029:
-
Assignee: Mahesh Raju Somalaraju

> hive query fails with Filesystem closed error
> -
>
> Key: HIVE-27029
> URL: https://issues.apache.org/jira/browse/HIVE-27029
> Project: Hive
> Issue Type: Bug
> Reporter: Mahesh Raju Somalaraju
> Assignee: Mahesh Raju Somalaraju
> Priority: Major
>
> Caused by: java.io.IOException: Filesystem closed
> at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:483) ~[hadoop-hdfs-client-3.1.1.7.1.8.11-3.jar:?]
> at org.apache.hadoop.hdfs.DFSClient.getEZForPath(DFSClient.java:2771) ~[hadoop-hdfs-client-3.1.1.7.1.8.11-3.jar:?]
> at org.apache.hadoop.hdfs.DistributedFileSystem$54.doCall(DistributedFileSystem.java:2796) ~[hadoop-hdfs-client-3.1.1.7.1.8.11-3.jar:?]
> at org.apache.hadoop.hdfs.DistributedFileSystem$54.doCall(DistributedFileSystem.java:2793) ~[hadoop-hdfs-client-3.1.1.7.1.8.11-3.jar:?]
> at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) ~[hadoop-common-3.1.1.7.1.8.11-3.jar:?]
> at org.apache.hadoop.hdfs.DistributedFileSystem.getEZForPath(DistributedFileSystem.java:2812) ~[hadoop-hdfs-client-3.1.1.7.1.8.11-3.jar:?]
> at org.apache.hadoop.hdfs.client.HdfsAdmin.getEncryptionZoneForPath(HdfsAdmin.java:374) ~[hadoop-hdfs-client-3.1.1.7.1.8.11-3.jar:?]
> at org.apache.hadoop.hive.shims.Hadoop23Shims$HdfsEncryptionShim.getEncryptionZoneForPath(Hadoop23Shims.java:1384) ~[hive-exec-3.1.3000.7.1.8.11-3.jar:3.1.3000.7.1.8.11-3]
> at org.apache.hadoop.hive.shims.Hadoop23Shims$HdfsEncryptionShim.isPathEncrypted(Hadoop23Shims.java:1379) ~[hive-exec-3.1.3000.7.1.8.11-3.jar:3.1.3000.7.1.8.11-3]
> at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.isPathEncrypted(SemanticAnalyzer.java:2484) ~[hive-exec-3.1.3000.7.1.8.11-3.jar:3.1.3000.7.1.8.11-3]
>
> steps:
> 1. When using an explicit queue name, the tez.queue.access check is used. If statistics gathering is enabled, the second insert fails at the compute_stats() phase.
> beeline --hiveconf tez.queue.name=default -e "
> SET hive.query.results.cache.enabled=false;
> SET hive.fetch.task.conversion=none;
> SET hive.stats.autogather=true;
> drop table if exists default.bigd35368p100;
> create table default.bigd35368p100 (name string) partitioned by (id int);
> insert into default.bigd35368p100 select * from default.bigd35368e100;
> drop table if exists default.bigd35368p100;
> create table default.bigd35368p100 (name string) partitioned by (id int);
> insert into default.bigd35368p100 select * from default.bigd35368e100;
> "
[jira] [Assigned] (HIVE-26983) Apache Hive website Getting started page showing 404 Error
[ https://issues.apache.org/jira/browse/HIVE-26983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mahesh Raju Somalaraju reassigned HIVE-26983:
-

> Apache Hive website Getting started page showing 404 Error
> --
>
> Key: HIVE-26983
> URL: https://issues.apache.org/jira/browse/HIVE-26983
> Project: Hive
> Issue Type: Bug
> Components: Hive
> Reporter: Mahesh Raju Somalaraju
> Assignee: Mahesh Raju Somalaraju
> Priority: Minor
>
> [https://hive.apache.org/GettingStarted] When we click this link we get a 404 Not Found page. Need to check and fix the issue.