[jira] [Updated] (HIVE-27428) CTAS fails with SemanticException when join subquery has complex type column and false filter predicate

2023-06-09 Thread Naresh P R (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naresh P R updated HIVE-27428:
--
Description: 
Repro steps:
{code:java}
drop table if exists table1;
drop table if exists table2;

create table table1 (a string, b string);
create table table2 (complex_column array, values:array);

-- CTAS failing query
create table table3 as with t1 as (select * from table1), t2 as (select * from 
table2 where 1=0) select t1.*, t2.* from t1 left join t2;{code}
Exception:
{code:java}
Caused by: org.apache.hadoop.hive.ql.parse.SemanticException: CREATE-TABLE-AS-SELECT creates a VOID type, please use CAST to specify the type, near field: t2.complex_column
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.deriveFileSinkColTypes(SemanticAnalyzer.java:8171)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.deriveFileSinkColTypes(SemanticAnalyzer.java:8129)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFileSinkPlan(SemanticAnalyzer.java:7822)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPostGroupByBodyPlan(SemanticAnalyzer.java:11248)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:11120)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:12050)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:11916)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genOPTree(SemanticAnalyzer.java:12730)
        at org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:722)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12831)
        at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:442)
        at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:300)
        at org.apache.hadoop.hive.ql.Compiler.analyze(Compiler.java:220)
        at org.apache.hadoop.hive.ql.Compiler.compile(Compiler.java:105)
        at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:194)  {code}

  was:
Repro steps:
{code:java}
drop table if exists table1;
drop table if exists table2;

create table table1 (a string, b string);
create table table2 (complex_column array, values:array);

-- CTAS failing query
create table table3 as with t1 as (select * from table1), t2 as (select * from 
table2 where 1=0) select t1.*, t2.* from t1 left join t2;{code}
Exception:
{code:java}
Caused by: org.apache.hadoop.hive.ql.parse.SemanticException: CREATE-TABLE-AS-SELECT creates a VOID type, please use CAST to specify the type, near field: t2.df0rrd_prod_wers_x
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.deriveFileSinkColTypes(SemanticAnalyzer.java:8171)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.deriveFileSinkColTypes(SemanticAnalyzer.java:8129)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFileSinkPlan(SemanticAnalyzer.java:7822)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPostGroupByBodyPlan(SemanticAnalyzer.java:11248)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:11120)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:12050)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:11916)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genOPTree(SemanticAnalyzer.java:12730)
        at org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:722)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12831)
        at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:442)
        at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:300)
        at org.apache.hadoop.hive.ql.Compiler.analyze(Compiler.java:220)
        at org.apache.hadoop.hive.ql.Compiler.compile(Compiler.java:105)
        at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:194)  {code}


> CTAS fails with SemanticException when join subquery has complex type column 
> and false filter predicate
> ---
>
> Key: HIVE-27428
> URL: https://issues.apache.org/jira/browse/HIVE-27428
> Project: Hive
>  Issue Type: Bug
>Reporter: Naresh P R
>Priority: Major
>
> Repro steps:
> {code:java}
> drop table if exists table1;
> drop table if exists table2;
> create table table1 (a string, b string);
> create table table2 

[jira] [Created] (HIVE-27428) CTAS fails with SemanticException when join subquery has complex type column and false filter predicate

2023-06-09 Thread Naresh P R (Jira)
Naresh P R created HIVE-27428:
-

 Summary: CTAS fails with SemanticException when join subquery has 
complex type column and false filter predicate
 Key: HIVE-27428
 URL: https://issues.apache.org/jira/browse/HIVE-27428
 Project: Hive
  Issue Type: Bug
Reporter: Naresh P R


Repro steps:
{code:java}
drop table if exists table1;
drop table if exists table2;

create table table1 (a string, b string);
create table table2 (complex_column array, values:array);

-- CTAS failing query
create table table3 as with t1 as (select * from table1), t2 as (select * from 
table2 where 1=0) select t1.*, t2.* from t1 left join t2;{code}
Exception:
{code:java}
Caused by: org.apache.hadoop.hive.ql.parse.SemanticException: CREATE-TABLE-AS-SELECT creates a VOID type, please use CAST to specify the type, near field: t2.df0rrd_prod_wers_x
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.deriveFileSinkColTypes(SemanticAnalyzer.java:8171)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.deriveFileSinkColTypes(SemanticAnalyzer.java:8129)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFileSinkPlan(SemanticAnalyzer.java:7822)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPostGroupByBodyPlan(SemanticAnalyzer.java:11248)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:11120)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:12050)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:11916)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genOPTree(SemanticAnalyzer.java:12730)
        at org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:722)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12831)
        at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:442)
        at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:300)
        at org.apache.hadoop.hive.ql.Compiler.analyze(Compiler.java:220)
        at org.apache.hadoop.hive.ql.Compiler.compile(Compiler.java:105)
        at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:194)  {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work started] (HIVE-27427) Automatic rerunning of failed tests in Hive Pre-commit job

2023-06-09 Thread Dmitriy Fingerman (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-27427 started by Dmitriy Fingerman.

> Automatic rerunning of failed tests in Hive Pre-commit job
> --
>
> Key: HIVE-27427
> URL: https://issues.apache.org/jira/browse/HIVE-27427
> Project: Hive
>  Issue Type: Improvement
>Reporter: Dmitriy Fingerman
>Assignee: Dmitriy Fingerman
>Priority: Major
>
> It often happens that Hive unit tests fail during pre-commit which requires 
> rerunning the whole pre-commit job and creates hours of delays. Maven has the 
> ability to rerun failed tests. There is the following property in 
> maven-surefire-plugin which can be used for that:
> {code:java}
> rerunFailingTestsCount{code}
>  * [Dev mail 
> discussion|https://lists.apache.org/thread/3vfw9b7wc35vr17zjzk1pq2jrgtkdvrq]
>  * [Rerun Failing 
> Tests|https://maven.apache.org/surefire/maven-surefire-plugin/examples/rerun-failing-tests.html]
>  





[jira] [Updated] (HIVE-27427) Automatic rerunning of failed tests in Hive Pre-commit job

2023-06-09 Thread Dmitriy Fingerman (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitriy Fingerman updated HIVE-27427:
-
Description: 
It often happens that Hive unit tests fail during pre-commit which requires 
rerunning the whole pre-commit job and creates hours of delays. Maven has the 
ability to rerun failed tests. There is the following property in 
maven-surefire-plugin which can be used for that:
{code:java}
rerunFailingTestsCount{code}
 

[Dev mail 
discussion|https://lists.apache.org/thread/3vfw9b7wc35vr17zjzk1pq2jrgtkdvrq]

[Rerun Failing 
Tests|https://maven.apache.org/surefire/maven-surefire-plugin/examples/rerun-failing-tests.html]

 

  was:
It often happens that Hive unit tests fail during pre-commit which requires 
rerunning the whole pre-commit job and creates hours of delays. Maven has the 
ability to rerun failed tests. There is the following property in 
maven-surefire-plugin which can be used for that:
{code:java}
rerunFailingTestsCount{code}
 

[Dev mail 
discussion|https://lists.apache.org/thread/3vfw9b7wc35vr17zjzk1pq2jrgtkdvrq]

[Rerun Failing 
Tests|[http://example.com|https://maven.apache.org/surefire/maven-surefire-plugin/examples/rerun-failing-tests.html]]

 


> Automatic rerunning of failed tests in Hive Pre-commit job
> --
>
> Key: HIVE-27427
> URL: https://issues.apache.org/jira/browse/HIVE-27427
> Project: Hive
>  Issue Type: Improvement
>Reporter: Dmitriy Fingerman
>Assignee: Dmitriy Fingerman
>Priority: Major
>
> It often happens that Hive unit tests fail during pre-commit which requires 
> rerunning the whole pre-commit job and creates hours of delays. Maven has the 
> ability to rerun failed tests. There is the following property in 
> maven-surefire-plugin which can be used for that:
> {code:java}
> rerunFailingTestsCount{code}
>  
> [Dev mail 
> discussion|https://lists.apache.org/thread/3vfw9b7wc35vr17zjzk1pq2jrgtkdvrq]
> [Rerun Failing 
> Tests|https://maven.apache.org/surefire/maven-surefire-plugin/examples/rerun-failing-tests.html]
>  





[jira] [Updated] (HIVE-27427) Automatic rerunning of failed tests in Hive Pre-commit job

2023-06-09 Thread Dmitriy Fingerman (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitriy Fingerman updated HIVE-27427:
-
Description: 
It often happens that Hive unit tests fail during pre-commit which requires 
rerunning the whole pre-commit job and creates hours of delays. Maven has the 
ability to rerun failed tests. There is the following property in 
maven-surefire-plugin which can be used for that:
{code:java}
rerunFailingTestsCount{code}
 * [Dev mail 
discussion|https://lists.apache.org/thread/3vfw9b7wc35vr17zjzk1pq2jrgtkdvrq]
 * [Rerun Failing 
Tests|https://maven.apache.org/surefire/maven-surefire-plugin/examples/rerun-failing-tests.html]

 

  was:
It often happens that Hive unit tests fail during pre-commit which requires 
rerunning the whole pre-commit job and creates hours of delays. Maven has the 
ability to rerun failed tests. There is the following property in 
maven-surefire-plugin which can be used for that:
{code:java}
rerunFailingTestsCount{code}
 

[Dev mail 
discussion|https://lists.apache.org/thread/3vfw9b7wc35vr17zjzk1pq2jrgtkdvrq]

[Rerun Failing 
Tests|https://maven.apache.org/surefire/maven-surefire-plugin/examples/rerun-failing-tests.html]

 


> Automatic rerunning of failed tests in Hive Pre-commit job
> --
>
> Key: HIVE-27427
> URL: https://issues.apache.org/jira/browse/HIVE-27427
> Project: Hive
>  Issue Type: Improvement
>Reporter: Dmitriy Fingerman
>Assignee: Dmitriy Fingerman
>Priority: Major
>
> It often happens that Hive unit tests fail during pre-commit which requires 
> rerunning the whole pre-commit job and creates hours of delays. Maven has the 
> ability to rerun failed tests. There is the following property in 
> maven-surefire-plugin which can be used for that:
> {code:java}
> rerunFailingTestsCount{code}
>  * [Dev mail 
> discussion|https://lists.apache.org/thread/3vfw9b7wc35vr17zjzk1pq2jrgtkdvrq]
>  * [Rerun Failing 
> Tests|https://maven.apache.org/surefire/maven-surefire-plugin/examples/rerun-failing-tests.html]
>  





[jira] [Updated] (HIVE-27427) Automatic rerunning of failed tests in Hive Pre-commit job

2023-06-09 Thread Dmitriy Fingerman (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitriy Fingerman updated HIVE-27427:
-
Description: 
It often happens that Hive unit tests fail during pre-commit which requires 
rerunning the whole pre-commit job and creates hours of delays. Maven has the 
ability to rerun failed tests. There is the following property in 
maven-surefire-plugin which can be used for that:
{code:java}
rerunFailingTestsCount{code}
 

[Dev mail 
discussion|https://lists.apache.org/thread/3vfw9b7wc35vr17zjzk1pq2jrgtkdvrq]

[Rerun Failing 
Tests|[http://example.com|https://maven.apache.org/surefire/maven-surefire-plugin/examples/rerun-failing-tests.html]]

 

  was:
It often happens that Hive unit tests fail during pre-commit which requires 
rerunning the whole pre-commit job and creates hours of delays. What if we set 
Maven config to retry failed tests automatically X times? There is 
"rerunFailingTestsCount" property in maven-surefire-plugin which can be used 
for that. I would like to hear the feedback and if it is positive I could open 
a JIRA ticket and work on it.

[Dev mail 
discussion|https://lists.apache.org/thread/3vfw9b7wc35vr17zjzk1pq2jrgtkdvrq]

 

 


> Automatic rerunning of failed tests in Hive Pre-commit job
> --
>
> Key: HIVE-27427
> URL: https://issues.apache.org/jira/browse/HIVE-27427
> Project: Hive
>  Issue Type: Improvement
>Reporter: Dmitriy Fingerman
>Assignee: Dmitriy Fingerman
>Priority: Major
>
> It often happens that Hive unit tests fail during pre-commit which requires 
> rerunning the whole pre-commit job and creates hours of delays. Maven has the 
> ability to rerun failed tests. There is the following property in 
> maven-surefire-plugin which can be used for that:
> {code:java}
> rerunFailingTestsCount{code}
>  
> [Dev mail 
> discussion|https://lists.apache.org/thread/3vfw9b7wc35vr17zjzk1pq2jrgtkdvrq]
> [Rerun Failing 
> Tests|[http://example.com|https://maven.apache.org/surefire/maven-surefire-plugin/examples/rerun-failing-tests.html]]
>  





[jira] [Updated] (HIVE-27427) Automatic rerunning of failed tests in Hive Pre-commit job

2023-06-09 Thread Dmitriy Fingerman (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitriy Fingerman updated HIVE-27427:
-
Description: 
It often happens that Hive unit tests fail during pre-commit which requires 
rerunning the whole pre-commit job and creates hours of delays. What if we set 
Maven config to retry failed tests automatically X times? There is 
"rerunFailingTestsCount" property in maven-surefire-plugin which can be used 
for that. I would like to hear the feedback and if it is positive I could open 
a JIRA ticket and work on it.

[Dev mail 
discussion|https://lists.apache.org/thread/3vfw9b7wc35vr17zjzk1pq2jrgtkdvrq]

 

 

  was:
It often happens that Hive unit tests fail during pre-commit which requires 
rerunning the whole pre-commit job and creates hours of delays. What if we set 
Maven config to retry failed tests automatically X times? There is 
"rerunFailingTestsCount" property in maven-surefire-plugin which can be used 
for that. I would like to hear the feedback and if it is positive I could open 
a JIRA ticket and work on it.

 

[Dev mail 
discussion|https://lists.apache.org/thread/3vfw9b7wc35vr17zjzk1pq2jrgtkdvrq]


> Automatic rerunning of failed tests in Hive Pre-commit job
> --
>
> Key: HIVE-27427
> URL: https://issues.apache.org/jira/browse/HIVE-27427
> Project: Hive
>  Issue Type: Improvement
>Reporter: Dmitriy Fingerman
>Assignee: Dmitriy Fingerman
>Priority: Major
>
> It often happens that Hive unit tests fail during pre-commit which requires 
> rerunning the whole pre-commit job and creates hours of delays. What if we 
> set Maven config to retry failed tests automatically X times? There is 
> "rerunFailingTestsCount" property in maven-surefire-plugin which can be used 
> for that. I would like to hear the feedback and if it is positive I could 
> open a JIRA ticket and work on it.
> [Dev mail 
> discussion|https://lists.apache.org/thread/3vfw9b7wc35vr17zjzk1pq2jrgtkdvrq]
>  
>  





[jira] [Created] (HIVE-27427) Automatic rerunning of failed tests in Hive Pre-commit job

2023-06-09 Thread Dmitriy Fingerman (Jira)
Dmitriy Fingerman created HIVE-27427:


 Summary: Automatic rerunning of failed tests in Hive Pre-commit job
 Key: HIVE-27427
 URL: https://issues.apache.org/jira/browse/HIVE-27427
 Project: Hive
  Issue Type: Improvement
Reporter: Dmitriy Fingerman
Assignee: Dmitriy Fingerman


It often happens that Hive unit tests fail during pre-commit which requires 
rerunning the whole pre-commit job and creates hours of delays. What if we set 
Maven config to retry failed tests automatically X times? There is 
"rerunFailingTestsCount" property in maven-surefire-plugin which can be used 
for that. I would like to hear the feedback and if it is positive I could open 
a JIRA ticket and work on it.

 

[Dev mail 
discussion|https://lists.apache.org/thread/3vfw9b7wc35vr17zjzk1pq2jrgtkdvrq]





[jira] [Resolved] (HIVE-27293) Vectorization: Incorrect results with nvl for ORC table

2023-06-09 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan resolved HIVE-27293.
-
Resolution: Fixed

> Vectorization: Incorrect results with nvl for ORC table
> ---
>
> Key: HIVE-27293
> URL: https://issues.apache.org/jira/browse/HIVE-27293
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 4.0.0-alpha-2
>Reporter: Riju Trivedi
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>  Labels: pull-request-available
> Attachments: esource.txt, vectorization_nvl.q
>
>
> Attached repro.q file and data file used to reproduce the issue.
> {code:java}
> Insert overwrite table etarget
> select mt.*, floor(rand() * 1) as bdata_no from (select nvl(np.client_id,' 
> '),nvl(np.id_enddate,cast(0 as decimal(10,0))),nvl(np.client_gender,' 
> '),nvl(np.birthday,cast(0 as decimal(10,0))),nvl(np.nationality,' 
> '),nvl(np.address_zipcode,' '),nvl(np.income,cast(0 as 
> decimal(15,2))),nvl(np.address,' '),nvl(np.part_date,cast(0 as int)) from 
> (select * from esource where part_date = 20230414) np) mt;
>  {code}
> Outcome:
> {code:java}
> select client_id,birthday,income from etarget; 
> 15678   0  0.00
> 67891  19313  -1.00
> 12345  0  0.00{code}
> Expected Result :
> {code:java}
> select client_id,birthday,income from etarget; 
> 12345 19613 -1.00
> 67891 19313 -1.00 
> 15678 0  0.00{code}
> Disabling hive.vectorized.use.vectorized.input.format produces correct output.





[jira] [Updated] (HIVE-27425) Upgrade Nimbus-JOSE-JWT to 9.24+ due to CVEs coming from json-smart

2023-06-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-27425:
--
Labels: pull-request-available  (was: )

> Upgrade Nimbus-JOSE-JWT to 9.24+ due to CVEs coming from json-smart
> ---
>
> Key: HIVE-27425
> URL: https://issues.apache.org/jira/browse/HIVE-27425
> Project: Hive
>  Issue Type: Task
>Reporter: Devaspati Krishnatri
>Assignee: Devaspati Krishnatri
>Priority: Major
>  Labels: pull-request-available
>
> Nimbus-JOSE-JWT before 9.24 uses a vulnerable version of json-smart. 
> nimbus-jose-jwt 9.24 drops the json-smart dependency completely and 
> replaces it with *Gson 2.9.1 (shaded)*, as seen in the commit history here: 
> [https://bitbucket.org/connect2id/nimbus-jose-jwt/commits/tag/9.24].
> json-smart before 2.4.9 is affected by CVE-2023-1370.
> CVE-2023-1370 - [json-smart|https://netplex.github.io/json-smart/] is a 
> performance-focused JSON processor library. When reaching a '[' or '{' character 
> in the JSON input, the code parses an array or an object respectively. It was 
> discovered that the code does not have any limit to the nesting of such 
> arrays or objects. Since the parsing of nested arrays and objects is done 
> recursively, nesting too many of them can cause a stack exhaustion (stack 
> overflow) and crash the software.
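
The missing nesting limit described above can be illustrated with a small sketch. This is not json-smart's code: the class name `NestingDepthCheck`, the method `maxNesting`, and the `maxDepth` parameter are hypothetical names chosen for illustration. The sketch scans the input iteratively and rejects it once bracket nesting exceeds a caller-supplied limit, which is the general kind of guard a recursive parser would need before descending.

```java
// Illustrative only: a nesting-depth guard, not json-smart's actual implementation.
final class NestingDepthCheck {
    private NestingDepthCheck() {}

    /**
     * Returns the deepest level of '[' / '{' nesting in the input,
     * ignoring brackets that appear inside JSON string literals.
     * Throws IllegalArgumentException as soon as maxDepth is exceeded.
     * It does not validate that the input is well-formed JSON.
     */
    static int maxNesting(String json, int maxDepth) {
        int depth = 0;
        int deepest = 0;
        boolean inString = false;
        for (int i = 0; i < json.length(); i++) {
            char c = json.charAt(i);
            if (inString) {
                if (c == '\\') {
                    i++; // skip the escaped character
                } else if (c == '"') {
                    inString = false;
                }
            } else if (c == '"') {
                inString = true;
            } else if (c == '[' || c == '{') {
                depth++;
                if (depth > maxDepth) {
                    throw new IllegalArgumentException("nesting deeper than " + maxDepth);
                }
                deepest = Math.max(deepest, depth);
            } else if (c == ']' || c == '}') {
                depth--;
            }
        }
        return deepest;
    }
}
```

Because the check is a loop rather than a recursion, its own stack use stays constant however deeply the input is nested.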





[jira] [Commented] (HIVE-27418) UNION ALL + ORDER BY ordinal works incorrectly for all const queries

2023-06-09 Thread zhangbutao (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-27418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17730898#comment-17730898
 ] 

zhangbutao commented on HIVE-27418:
---

Hi [~csringhofer], could you provide more info about your Hive cluster 
environment, e.g. the Hive and Hadoop versions?

And which execution engine did you use for the test: Tez or MR?

> UNION ALL + ORDER BY ordinal works incorrectly for all const queries
> 
>
> Key: HIVE-27418
> URL: https://issues.apache.org/jira/browse/HIVE-27418
> Project: Hive
>  Issue Type: Bug
>Reporter: Csaba Ringhofer
>Priority: Major
>
> For the following query I get results in wrong order:
> SELECT '1', 'b' UNION ALL SELECT '2', 'a'  ORDER BY 2;
> +------+------+
> | _c0  | _c1  |
> +------+------+
> | 1    | b    |
> | 2    | a    |
> +------+------+
> I get correct results if:
> - the column has an alias
> - the same rows come from tables
> - the UNION ALL part of the query is in a sub-query and ORDER BY is run on 
> the sub-query
> Checked with Postgres and Apache Impala, and they apply ORDER BY correctly.
> (Also noted that the ordinal after ORDER BY is not checked, so it could be 20 
> and Hive doesn't complain.)





[jira] [Commented] (HIVE-27332) Add retry backoff mechanism for abort cleanup

2023-06-09 Thread Sourabh Badhya (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-27332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17730892#comment-17730892
 ] 

Sourabh Badhya commented on HIVE-27332:
---

Thanks [~veghlaci05] and [~dkuzmenko] for the reviews.

> Add retry backoff mechanism for abort cleanup
> -
>
> Key: HIVE-27332
> URL: https://issues.apache.org/jira/browse/HIVE-27332
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sourabh Badhya
>Assignee: Sourabh Badhya
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> HIVE-27019 and HIVE-27020 added the functionality to directly clean data 
> directories from aborted transactions without using Initiator & Worker. 
> However, in the event of continuous failure during cleanup, the retry 
> mechanism is initiated every single time. We need to add a retry backoff 
> mechanism to control the time before a retry is initiated again, rather than 
> retrying continuously.
> There are broadly 3 cases wherein retry due to abort cleanup is impacted - 
> *1. Abort cleanup on the table failed + Compaction on the table failed.*
> *2. Abort cleanup on the table failed + Compaction on the table passed*
> *3. Abort cleanup on the table failed + No compaction on the table.*
> *Solution -* 
> *We reuse COMPACTION_QUEUE table to store the retry metadata -* 
> *Advantage: Most of the fields with respect to retry are present in 
> COMPACTION_QUEUE. Hence we can use the same for storing retry metadata. A 
> compaction type called ABORT_CLEANUP ('c') is introduced. The CQ_STATE will 
> remain ready for cleaning for such records.*
> *Actions performed by TaskHandler in the case of failure -* 
> *AbortTxnCleaner -* 
> Action: Just add retry details in the queue table during the abort failure.
> *CompactionCleaner -* 
> Action: If compaction on the same table is successful, delete the retry entry 
> in markCleaned when removing any TXN_COMPONENTS entries except when there are 
> no uncompacted aborts. We do not want to be in a situation where there is a 
> queue entry for a table but there is no record in TXN_COMPONENTS associated 
> with the same table.
> *Advantage: No performance issues are expected with this approach, since we 
> delete 1 record most of the time for the associated table/partition.*
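
A retry backoff of the kind requested above could be computed roughly as follows. This is a sketch under stated assumptions, not the change committed for this issue: the class `RetryBackoff`, the parameters `baseDelayMs` and `maxDelayMs`, and the doubling policy are all hypothetical, since the issue text does not specify how the delay grows.

```java
// Illustrative only: an exponential backoff with a cap; names and policy are assumptions.
final class RetryBackoff {
    private RetryBackoff() {}

    /** Delay before the given retry attempt (0-based): baseDelayMs * 2^attempt, capped at maxDelayMs. */
    static long delayMs(int attempt, long baseDelayMs, long maxDelayMs) {
        if (attempt < 0 || baseDelayMs <= 0 || maxDelayMs < baseDelayMs) {
            throw new IllegalArgumentException("invalid backoff arguments");
        }
        long delay = baseDelayMs;
        for (int i = 0; i < attempt && delay < maxDelayMs; i++) {
            // Double the delay, guarding against overflow before the cap is applied.
            delay = (delay > Long.MAX_VALUE / 2) ? Long.MAX_VALUE : delay * 2;
        }
        return Math.min(delay, maxDelayMs);
    }

    /** True when enough time has passed since the last failed attempt to retry again. */
    static boolean readyToRetry(long lastFailureMs, long nowMs, int attempt,
                                long baseDelayMs, long maxDelayMs) {
        return nowMs - lastFailureMs >= delayMs(attempt, baseDelayMs, maxDelayMs);
    }
}
```

With a record of the last failure time and the number of attempts so far (the kind of retry metadata the description proposes to keep in COMPACTION_QUEUE), a cleaner could call `readyToRetry` and skip entries whose delay has not yet elapsed.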





[jira] [Resolved] (HIVE-27332) Add retry backoff mechanism for abort cleanup

2023-06-09 Thread Sourabh Badhya (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sourabh Badhya resolved HIVE-27332.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

> Add retry backoff mechanism for abort cleanup
> -
>
> Key: HIVE-27332
> URL: https://issues.apache.org/jira/browse/HIVE-27332
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sourabh Badhya
>Assignee: Sourabh Badhya
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> HIVE-27019 and HIVE-27020 added the functionality to directly clean data 
> directories from aborted transactions without using Initiator & Worker. 
> However, in the event of continuous failure during cleanup, the retry 
> mechanism is initiated every single time. We need to add a retry backoff 
> mechanism to control the time before a retry is initiated again, rather than 
> retrying continuously.
> There are broadly 3 cases wherein retry due to abort cleanup is impacted - 
> *1. Abort cleanup on the table failed + Compaction on the table failed.*
> *2. Abort cleanup on the table failed + Compaction on the table passed*
> *3. Abort cleanup on the table failed + No compaction on the table.*
> *Solution -* 
> *We reuse COMPACTION_QUEUE table to store the retry metadata -* 
> *Advantage: Most of the fields with respect to retry are present in 
> COMPACTION_QUEUE. Hence we can use the same for storing retry metadata. A 
> compaction type called ABORT_CLEANUP ('c') is introduced. The CQ_STATE will 
> remain ready for cleaning for such records.*
> *Actions performed by TaskHandler in the case of failure -* 
> *AbortTxnCleaner -* 
> Action: Just add retry details in the queue table during the abort failure.
> *CompactionCleaner -* 
> Action: If compaction on the same table is successful, delete the retry entry 
> in markCleaned when removing any TXN_COMPONENTS entries except when there are 
> no uncompacted aborts. We do not want to be in a situation where there is a 
> queue entry for a table but there is no record in TXN_COMPONENTS associated 
> with the same table.
> *Advantage: No performance issues are expected with this approach, since we 
> delete 1 record most of the time for the associated table/partition.*





[jira] [Resolved] (HIVE-27018) Move aborted transaction cleanup outside compaction process

2023-06-09 Thread Sourabh Badhya (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sourabh Badhya resolved HIVE-27018.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

>  Move aborted transaction cleanup outside compaction process
> 
>
> Key: HIVE-27018
> URL: https://issues.apache.org/jira/browse/HIVE-27018
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sourabh Badhya
>Assignee: Sourabh Badhya
>Priority: Major
> Fix For: 4.0.0
>
>
> Aborted transactions processing is tightly integrated into the compaction 
> pipeline and consists of 3 main stages: Initiator, Compactor (Worker), 
> Cleaner. That could be simplified by doing all work on the Cleaner side.
> *Potential Benefits -* 
> There are major advantages of implementing this on the cleaner side - 
>  1) Currently an aborted txn in the TXNS table blocks the cleaning of 
> TXN_TO_WRITE_ID table since nothing gets cleaned above MIN(aborted txnid) in 
> the current implementation. After implementing this on the cleaner side, the 
> cleaner regularly checks and cleans the aborted records in the TXN_COMPONENTS 
> table, which in turn makes the AcidTxnCleanerService clean the aborted txns 
> in TXNS table.
>  2) Initiator and worker do not do anything on tables which contain only 
> aborted directories. It's the cleaner which removes the aborted directories 
> of the table. Hence all operations associated with the initiator and worker 
> for these tables are wasteful. These wasteful operations are avoided.
> 3) DP writes which are aborted are skipped by the worker currently. Hence 
> once again the cleaner is the one deleting the aborted directories. All 
> operations associated with the initiator and worker for this entry are 
> wasteful. These wasteful operations are avoided.
> *Proposed solution -* 
> *Implement logic to handle aborted transactions exclusively in Cleaner.*
> Implement logic to fetch the TXN_COMPONENTS which are associated with 
> transactions in aborted state and send the required information to the 
> cleaner. Cleaner must clean up the aborted deltas/delete deltas by using the 
> aborted directories in the AcidState of the table/partition.
> It is also better to separate entities which provide information of 
> compaction and abort cleanup to enhance code modularity. This can be done in 
> this way -
> Cleaner can be divided into separate entities like - 
> *1) Handler* - This entity fetches the data from the metastore DB from 
> relevant tables and converts it into a request entity called CleaningRequest. 
> It would also do SQL operations post cleanup (postprocess). Every type of 
> cleaning request is provided by a separate handler.
> *2) Filesystem remover* - This entity fetches the cleaning requests from 
> various handlers and deletes them according to the cleaning request.
> *This division allows for dynamic extensibility of cleanup from multiple 
> handlers. Every handler is responsible for providing cleaning requests from a 
> specific source.*
> The proposed solution is resilient, i.e. in the event of an abrupt metastore 
> shutdown, the cleaner can still see the relevant entries in the metastore DB 
> and retry the cleaning task for that entry.
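
The handler / filesystem remover split described above might look like the following in-memory sketch. Only the name `CleaningRequest` comes from the description; `CleanupHandler`, `FsRemoverSketch`, `InMemoryHandler`, and all method and field names are hypothetical, and a plain list stands in for both the metastore tables and the filesystem.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** A unit of cleanup work: the directories to delete for one table/partition. */
final class CleaningRequest {
    final String source;              // which handler produced the request
    final List<String> obsoleteDirs;  // directories the remover should delete

    CleaningRequest(String source, List<String> obsoleteDirs) {
        this.source = source;
        this.obsoleteDirs = obsoleteDirs;
    }
}

/** Hypothetical handler contract: find pending work, then record its completion. */
interface CleanupHandler {
    List<CleaningRequest> findRequests();
    void afterCleanup(CleaningRequest request);
}

/** Hypothetical remover: executes requests from any number of handlers. */
final class FsRemoverSketch {
    final List<String> deleted = new ArrayList<>(); // stands in for real filesystem deletes

    void run(List<CleanupHandler> handlers) {
        for (CleanupHandler handler : handlers) {
            for (CleaningRequest request : handler.findRequests()) {
                deleted.addAll(request.obsoleteDirs);
                handler.afterCleanup(request); // post-processing step of the handler
            }
        }
    }
}

/** Toy handler that keeps one pending request in memory instead of a database. */
final class InMemoryHandler implements CleanupHandler {
    private final List<CleaningRequest> pending = new ArrayList<>();
    int completed = 0;

    InMemoryHandler(String name, String... dirs) {
        pending.add(new CleaningRequest(name, Arrays.asList(dirs)));
    }

    @Override
    public List<CleaningRequest> findRequests() {
        return new ArrayList<>(pending); // copy, so afterCleanup can modify 'pending'
    }

    @Override
    public void afterCleanup(CleaningRequest request) {
        pending.remove(request);
        completed++;
    }
}
```

In this sketch a request stays pending until `afterCleanup` runs, so a run that stops midway leaves the entry visible for the next pass, which mirrors the resilience argument made in the description.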





[jira] [Updated] (HIVE-27426) Upgrade kryo version in iceberg module

2023-06-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-27426:
--
Labels: pull-request-available  (was: )

> Upgrade kryo version in iceberg module
> --
>
> Key: HIVE-27426
> URL: https://issues.apache.org/jira/browse/HIVE-27426
> Project: Hive
> Issue Type: Task
> Reporter: Devaspati Krishnatri
> Assignee: Devaspati Krishnatri
> Priority: Major
> Labels: pull-request-available
>






[jira] [Created] (HIVE-27426) Upgrade kryo version in iceberg module

2023-06-09 Thread Devaspati Krishnatri (Jira)
Devaspati Krishnatri created HIVE-27426:
---

 Summary: Upgrade kryo version in iceberg module
 Key: HIVE-27426
 URL: https://issues.apache.org/jira/browse/HIVE-27426
 Project: Hive
  Issue Type: Task
Reporter: Devaspati Krishnatri
Assignee: Devaspati Krishnatri








[jira] [Updated] (HIVE-27340) ThreadPool in HS2 over HTTP should respect the customized ThreadFactory

2023-06-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-27340:
--
Labels: pull-request-available  (was: )

> ThreadPool in HS2 over HTTP should respect the customized ThreadFactory
> ---
>
> Key: HIVE-27340
> URL: https://issues.apache.org/jira/browse/HIVE-27340
> Project: Hive
> Issue Type: Bug
> Components: HiveServer2
> Reporter: Zhihua Deng
> Assignee: Zhihua Deng
> Priority: Minor
> Labels: pull-request-available
>
> In Jetty, ExecutorThreadPool overrides the ThreadFactory of the 
> ThreadPoolExecutor even though the ThreadPoolExecutor has already 
> initialized its ThreadFactory:
> {code:java}
> _executor.setThreadFactory(this::newThread); {code}
> This override needs to be ignored, as we have already injected a 
> ThreadFactory into the ThreadPoolExecutor.
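One possible way to ignore such an override is to hand Jetty a ThreadPoolExecutor subclass whose setThreadFactory does nothing. The sketch below is an assumption about how this could look, not the actual Hive patch: PinnedFactoryExecutor, PinnedFactoryDemo, and the thread names are hypothetical, and only java.util.concurrent classes are used. It relies on the ThreadPoolExecutor constructor assigning the factory directly rather than calling setThreadFactory.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Hypothetical executor that keeps the ThreadFactory it was constructed with. */
class PinnedFactoryExecutor extends ThreadPoolExecutor {
    PinnedFactoryExecutor(int coreSize, int maxSize, ThreadFactory factory) {
        super(coreSize, maxSize, 60L, TimeUnit.SECONDS,
              new LinkedBlockingQueue<Runnable>(), factory);
    }

    @Override
    public void setThreadFactory(ThreadFactory ignored) {
        // Deliberately a no-op: later calls (such as the one a wrapping pool
        // might make) must not replace the factory injected at construction.
    }
}

class PinnedFactoryDemo {
    public static void main(String[] args) throws Exception {
        ThreadFactory custom = r -> new Thread(r, "custom-factory-thread");
        PinnedFactoryExecutor executor = new PinnedFactoryExecutor(1, 1, custom);

        // Simulates the override described in the issue; it has no effect here.
        executor.setThreadFactory(r -> new Thread(r, "overriding-factory-thread"));

        // The worker thread is still created by the custom factory, so this
        // prints the name assigned by that factory.
        executor.submit(() -> System.out.println(Thread.currentThread().getName())).get();
        executor.shutdown();
    }
}
```

The trade-off of this approach is that every caller loses the ability to change the factory after construction, which is acceptable only if the injected factory is meant to be final.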





[jira] [Updated] (HIVE-27425) Upgrade Nimbus-JOSE-JWT to 9.24+ due to CVEs coming from json-smart

2023-06-09 Thread Devaspati Krishnatri (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaspati Krishnatri updated HIVE-27425:

Summary: Upgrade Nimbus-JOSE-JWT to 9.24+ due to CVEs coming from 
json-smart  (was: Upgrade Nimbus-JOSE-JWT to 9.24 due to CVEs coming from 
json-smart)

> Upgrade Nimbus-JOSE-JWT to 9.24+ due to CVEs coming from json-smart
> ---
>
> Key: HIVE-27425
> URL: https://issues.apache.org/jira/browse/HIVE-27425
> Project: Hive
> Issue Type: Task
> Reporter: Devaspati Krishnatri
> Assignee: Devaspati Krishnatri
> Priority: Major
>
> Nimbus-JOSE-JWT before 9.24 uses a vulnerable version of json-smart. 
> nimbus-jose-jwt 9.24 dropped the json-smart dependency completely and 
> replaced it with *Gson 2.9.1 (shaded)*, as seen in 
> the commit history here: 
> [https://bitbucket.org/connect2id/nimbus-jose-jwt/commits/tag/9.24].
> Json-smart before 2.4.9 is affected by CVE-2023-1370.
> CVE-2023-1370 - [Json-smart|https://netplex.github.io/json-smart/] is a 
> performance focused, JSON processor lib. When reaching a '[' or '{' character 
> in the JSON input, the code parses an array or an object respectively. It was 
> discovered that the code does not have any limit to the nesting of such 
> arrays or objects. Since the parsing of nested arrays and objects is done 
> recursively, nesting too many of them can cause a stack exhaustion (stack 
> overflow) and crash the software.
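To illustrate the failure mode in the CVE description, the following sketch measures bracket nesting iteratively, so the check itself cannot exhaust the stack, and lets a caller reject input that exceeds a chosen limit before passing it to a recursive parser. It is not json-smart code and not the upstream fix; NestingGuard, maxDepth, withinLimit, and the limit value are hypothetical names and choices made for this example.

```java
/** Hypothetical pre-parse guard against deeply nested JSON input. */
final class NestingGuard {
    private NestingGuard() {}

    /** Returns the deepest '[' / '{' nesting level, ignoring brackets inside strings. */
    static int maxDepth(String json) {
        int depth = 0;
        int max = 0;
        boolean inString = false;
        boolean escaped = false;
        for (int i = 0; i < json.length(); i++) {
            char c = json.charAt(i);
            if (inString) {
                if (escaped) {
                    escaped = false;  // character after a backslash is literal
                } else if (c == '\\') {
                    escaped = true;
                } else if (c == '"') {
                    inString = false; // closing quote
                }
                continue;
            }
            switch (c) {
                case '"':
                    inString = true;
                    break;
                case '[':
                case '{':
                    depth++;
                    max = Math.max(max, depth);
                    break;
                case ']':
                case '}':
                    if (depth > 0) {
                        depth--;      // tolerate unbalanced input; this is not a validator
                    }
                    break;
                default:
                    break;
            }
        }
        return max;
    }

    /** True if the input nests no deeper than the given limit. */
    static boolean withinLimit(String json, int limit) {
        return maxDepth(json) <= limit;
    }
}
```

A caller could, for example, refuse any document for which withinLimit(input, 400) is false; the value 400 is arbitrary here and would need to be chosen against the parser's actual stack usage.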





[jira] [Created] (HIVE-27425) Upgrade Nimbus-JOSE-JWT to 9.24 due to CVEs coming from json-smart

2023-06-09 Thread Devaspati Krishnatri (Jira)
Devaspati Krishnatri created HIVE-27425:
---

 Summary: Upgrade Nimbus-JOSE-JWT to 9.24 due to CVEs coming from 
json-smart
 Key: HIVE-27425
 URL: https://issues.apache.org/jira/browse/HIVE-27425
 Project: Hive
  Issue Type: Task
Reporter: Devaspati Krishnatri
Assignee: Devaspati Krishnatri


Nimbus-JOSE-JWT before 9.24 uses a vulnerable version of json-smart. 

nimbus-jose-jwt 9.24 dropped the json-smart dependency completely and 
replaced it with *Gson 2.9.1 (shaded)*, as seen in the 
commit history here: 
[https://bitbucket.org/connect2id/nimbus-jose-jwt/commits/tag/9.24].

Json-smart before 2.4.9 is affected by CVE-2023-1370.

CVE-2023-1370 - [Json-smart|https://netplex.github.io/json-smart/] is a 
performance focused, JSON processor lib. When reaching a '[' or '{' character 
in the JSON input, the code parses an array or an object respectively. It was 
discovered that the code does not have any limit to the nesting of such arrays 
or objects. Since the parsing of nested arrays and objects is done recursively, 
nesting too many of them can cause a stack exhaustion (stack overflow) and 
crash the software.





[jira] [Assigned] (HIVE-27424) Add mvn dependency:tree run in github actions

2023-06-09 Thread Akshat Mathur (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akshat Mathur reassigned HIVE-27424:


Assignee: Akshat Mathur

> Add mvn dependency:tree run in github actions
> -
>
> Key: HIVE-27424
> URL: https://issues.apache.org/jira/browse/HIVE-27424
> Project: Hive
> Issue Type: Improvement
> Reporter: Akshat Mathur
> Assignee: Akshat Mathur
> Priority: Major
>
> From the discussion on [#4396|https://github.com/apache/hive/pull/4396]
> Run mvn dependency:tree in github actions





[jira] [Updated] (HIVE-27424) Add mvn dependency:tree run in github actions

2023-06-09 Thread Akshat Mathur (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akshat Mathur updated HIVE-27424:
-
Description: 
From the discussion on [#4396|https://github.com/apache/hive/pull/4396]

Run mvn dependency:tree in github actions

  was:
From the discussion on [https://github.com/apache/hive/pull/4396|#4396]

Run mvn dependency:tree in github actions


> Add mvn dependency:tree run in github actions
> -
>
> Key: HIVE-27424
> URL: https://issues.apache.org/jira/browse/HIVE-27424
> Project: Hive
> Issue Type: Improvement
> Reporter: Akshat Mathur
> Priority: Major
>
> From the discussion on [#4396|https://github.com/apache/hive/pull/4396]
> Run mvn dependency:tree in github actions





[jira] [Updated] (HIVE-27424) Add mvn dependency:tree run in github actions

2023-06-09 Thread Akshat Mathur (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akshat Mathur updated HIVE-27424:
-
Description: 
From the discussion on [https://github.com/apache/hive/pull/4396|#4396]

Run mvn dependency:tree in github actions

  was:
From the discussion on [#https://github.com/apache/hive/pull/4396] 

Run mvn dependency:tree in github actions


> Add mvn dependency:tree run in github actions
> -
>
> Key: HIVE-27424
> URL: https://issues.apache.org/jira/browse/HIVE-27424
> Project: Hive
> Issue Type: Improvement
> Reporter: Akshat Mathur
> Priority: Major
>
> From the discussion on [https://github.com/apache/hive/pull/4396|#4396]
> Run mvn dependency:tree in github actions





[jira] [Created] (HIVE-27424) Add mvn dependency:tree run in github actions

2023-06-09 Thread Akshat Mathur (Jira)
Akshat Mathur created HIVE-27424:


 Summary: Add mvn dependency:tree run in github actions
 Key: HIVE-27424
 URL: https://issues.apache.org/jira/browse/HIVE-27424
 Project: Hive
  Issue Type: Improvement
Reporter: Akshat Mathur


From the discussion on [#https://github.com/apache/hive/pull/4396] 

Run mvn dependency:tree in github actions


