[jira] [Commented] (HIVE-24048) Harmonise Jackson components to version 2.10.latest - Hive

2020-09-01 Thread Sai Hemanth Gantasala (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17188909#comment-17188909
 ] 

Sai Hemanth Gantasala commented on HIVE-24048:
--

Yes, that is the correct PR.

> Harmonise Jackson components to version 2.10.latest - Hive
> --
>
> Key: HIVE-24048
> URL: https://issues.apache.org/jira/browse/HIVE-24048
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Sai Hemanth Gantasala
>Assignee: Sai Hemanth Gantasala
>Priority: Major
>
> Hive uses the following Jackson components, which are not harmonised with 
> jackson-databind's version (2.10.0):
>  * jackson-dataformat-yaml 2.9.8
>  * jackson-jaxrs-base 2.9.8
> To avoid conflicts caused by version mismatches, please harmonise them with 
> jackson-databind's version.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-24048) Harmonise Jackson components to version 2.10.latest - Hive

2020-09-01 Thread Kevin Risden (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17188906#comment-17188906
 ] 

Kevin Risden commented on HIVE-24048:
-

[~hemanth619] it looks like the PR for this is 
https://github.com/apache/hive/pull/1411. Is that correct?

> Harmonise Jackson components to version 2.10.latest - Hive
> --
>
> Key: HIVE-24048
> URL: https://issues.apache.org/jira/browse/HIVE-24048
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Sai Hemanth Gantasala
>Assignee: Sai Hemanth Gantasala
>Priority: Major
>
> Hive uses the following Jackson components, which are not harmonised with 
> jackson-databind's version (2.10.0):
>  * jackson-dataformat-yaml 2.9.8
>  * jackson-jaxrs-base 2.9.8
> To avoid conflicts caused by version mismatches, please harmonise them with 
> jackson-databind's version.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22030) Bumping jackson version to 2.9.9 and 2.9.9.3 (jackson-databind)

2020-09-01 Thread Kevin Risden (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17188905#comment-17188905
 ] 

Kevin Risden commented on HIVE-22030:
-

[~adombi] looks like this was superseded by HIVE-23338

> Bumping jackson version to 2.9.9 and 2.9.9.3 (jackson-databind)
> ---
>
> Key: HIVE-22030
> URL: https://issues.apache.org/jira/browse/HIVE-22030
> Project: Hive
>  Issue Type: Task
>Reporter: Akos Dombi
>Assignee: Akos Dombi
>Priority: Major
> Fix For: 4.0.0
>
>
> Bump the following jackson versions:
>   - jackson version to 2.9.9
>   - jackson-databind version to 2.9.9.3



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HIVE-21498) Upgrade Thrift to 0.13.0

2020-09-01 Thread Naveen Gangam (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam resolved HIVE-21498.
--
Fix Version/s: 4.0.0
   Resolution: Fixed

The fix has been committed to master for 4.0.0. Thank you for all the work on 
this, [~hemanth619]. 
Hive is now using Apache Thrift 0.13.0.

> Upgrade Thrift to 0.13.0
> 
>
> Key: HIVE-21498
> URL: https://issues.apache.org/jira/browse/HIVE-21498
> Project: Hive
>  Issue Type: Bug
>  Components: Thrift API
>Reporter: Ajith S
>Assignee: Sai Hemanth Gantasala
>Priority: Critical
>  Labels: security
> Fix For: 4.0.0
>
>
> Upgrade to consider security fixes.
> Especially https://issues.apache.org/jira/browse/THRIFT-4506



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-21498) Upgrade Thrift to 0.13.0

2020-09-01 Thread Naveen Gangam (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam updated HIVE-21498:
-
Summary: Upgrade Thrift to 0.13.0  (was: Upgrade Thrift to 0.12.0)

> Upgrade Thrift to 0.13.0
> 
>
> Key: HIVE-21498
> URL: https://issues.apache.org/jira/browse/HIVE-21498
> Project: Hive
>  Issue Type: Bug
>  Components: Thrift API
>Reporter: Ajith S
>Assignee: Sai Hemanth Gantasala
>Priority: Critical
>  Labels: security
>
> Upgrade to consider security fixes.
> Especially https://issues.apache.org/jira/browse/THRIFT-4506



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-21313) Use faster function to point to instead of copy immutable byte arrays

2020-09-01 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21313?focusedWorklogId=477547&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-477547
 ]

ASF GitHub Bot logged work on HIVE-21313:
-

Author: ASF GitHub Bot
Created on: 02/Sep/20 00:41
Start Date: 02/Sep/20 00:41
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #548:
URL: https://github.com/apache/hive/pull/548#issuecomment-685209614


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 477547)
Time Spent: 50m  (was: 40m)

> Use faster function to point to instead of copy immutable byte arrays
> -
>
> Key: HIVE-21313
> URL: https://issues.apache.org/jira/browse/HIVE-21313
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: All Versions
>Reporter: ZhangXin
>Assignee: ZhangXin
>Priority: Minor
>  Labels: pull-request-available
> Fix For: All Versions
>
> Attachments: HIVE-21313.patch, HIVE-21313.patch
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> In file ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorAssignRow.java
> we may find code like this:
> ```
> Text text = (Text) convertTargetWritable;
> if (text == null) {
>     text = new Text();
> }
> text.set(string);
> ((BytesColumnVector) columnVector).setVal(
>     batchIndex, text.getBytes(), 0, text.getLength());
> ```
>  
> The `setVal` method copies the byte array returned by `text.getBytes()`, 
> which is unnecessary: since that byte array is immutable here, we can simply 
> use the `setRef` method to point at the specific byte array instead, which 
> also lowers memory usage.
>  
> Pull request on Github:  https://github.com/apache/hive/pull/548
>  
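The point-instead-of-copy idea described in the issue can be sketched in plain Java. This is an illustration only: `ByteColumn` and its methods are hypothetical stand-ins loosely modeled on `BytesColumnVector.setVal`/`setRef`, not Hive's actual API.

```java
import java.util.Arrays;

// Hypothetical minimal column vector illustrating copy vs. reference.
class ByteColumn {
    byte[][] vector = new byte[16][];

    // setVal-style: copies the byte range into a fresh array
    // (an extra allocation and copy per row).
    void setVal(int row, byte[] src, int start, int len) {
        vector[row] = Arrays.copyOfRange(src, start, start + len);
    }

    // setRef-style: stores a reference to the caller's array (no copy).
    // Safe only when the source array is never mutated afterwards.
    void setRef(int row, byte[] src, int start, int len) {
        vector[row] = src; // a real implementation would also keep start/len
    }
}

public class RefVsCopy {
    public static void main(String[] args) {
        byte[] data = "hello".getBytes();
        ByteColumn col = new ByteColumn();
        col.setVal(0, data, 0, data.length);
        col.setRef(1, data, 0, data.length);
        System.out.println(col.vector[0] == data); // false: a private copy
        System.out.println(col.vector[1] == data); // true: shared reference
    }
}
```

The reference variant avoids one array allocation and one copy per assigned row, which is where the memory saving in the issue comes from.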



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23721) MetaStoreDirectSql.ensureDbInit() need to optimize QuerySQL

2020-09-01 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23721?focusedWorklogId=477545&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-477545
 ]

ASF GitHub Bot logged work on HIVE-23721:
-

Author: ASF GitHub Bot
Created on: 02/Sep/20 00:41
Start Date: 02/Sep/20 00:41
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #1202:
URL: https://github.com/apache/hive/pull/1202#issuecomment-685209535


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 477545)
Time Spent: 0.5h  (was: 20m)

> MetaStoreDirectSql.ensureDbInit() need to optimize QuerySQL
> ---
>
> Key: HIVE-23721
> URL: https://issues.apache.org/jira/browse/HIVE-23721
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0, 3.1.2
> Environment: Hadoop 3.1(1700+ nodes)
> YARN 3.1 (with timelineserver enabled, https enabled)
> Hive 3.1 (15 HS2 instance)
> 6+ YARN Applications every day
>Reporter: YulongZ
>Assignee: zhangbutao
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-23721.01.patch
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Since Hive 3.0, catalogs were added to the metastore: many metastore tables 
> gained a "catName" column, and the table indexes now include "catName".
> In MetaStoreDirectSql.ensureDbInit(), the two queries below
> "
>   initQueries.add(pm.newQuery(MTableColumnStatistics.class, "dbName == ''"));
>   initQueries.add(pm.newQuery(MPartitionColumnStatistics.class, "dbName == ''"));
> "
> should use "catName == ''" instead of "dbName == ''", because "catName" is 
> the first index column.
> When the metastore data becomes large (for example, when 
> MPartitionColumnStatistics has millions of rows), the 
> newQuery(MPartitionColumnStatistics.class, "dbName == ''") query executes 
> very slowly in the metastore, and the "show tables" query in HiveServer2 
> becomes very slow too.
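The change proposed in the issue would look roughly like the fragment below. This is a sketch of a fragment of MetaStoreDirectSql.ensureDbInit(), not runnable on its own; it assumes the surrounding `initQueries` list and `pm` (PersistenceManager) from that method.

```java
// Filter on catName, the leading index column, instead of dbName,
// so the database can use the index when the statistics tables are large.
initQueries.add(pm.newQuery(MTableColumnStatistics.class, "catName == ''"));
initQueries.add(pm.newQuery(MPartitionColumnStatistics.class, "catName == ''"));
```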



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23698) Compiler support for row-level filtering on filterPredicates

2020-09-01 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23698?focusedWorklogId=477546&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-477546
 ]

ASF GitHub Bot logged work on HIVE-23698:
-

Author: ASF GitHub Bot
Created on: 02/Sep/20 00:41
Start Date: 02/Sep/20 00:41
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #:
URL: https://github.com/apache/hive/pull/#issuecomment-685209559


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 477546)
Time Spent: 0.5h  (was: 20m)

> Compiler support for row-level filtering on filterPredicates
> 
>
> Key: HIVE-23698
> URL: https://issues.apache.org/jira/browse/HIVE-23698
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Panagiotis Garefalakis
>Assignee: Panagiotis Garefalakis
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Similar to what we currently do for StorageHandlers, we should push down the 
> static expression for row-level filtering when the file format supports the 
> feature (ORC).
> I propose to split the filterExpr into a residual predicate and a pushed 
> predicate. If the predicate is completely pushed, then we remove the operator.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23074) SchemaTool sql script execution errors when updating the metadata's schema

2020-09-01 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23074?focusedWorklogId=477548&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-477548
 ]

ASF GitHub Bot logged work on HIVE-23074:
-

Author: ASF GitHub Bot
Created on: 02/Sep/20 00:41
Start Date: 02/Sep/20 00:41
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #967:
URL: https://github.com/apache/hive/pull/967#issuecomment-685209595


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 477548)
Time Spent: 1h 20m  (was: 1h 10m)

> SchemaTool sql script execution errors when updating the metadata's schema
> --
>
> Key: HIVE-23074
> URL: https://issues.apache.org/jira/browse/HIVE-23074
> Project: Hive
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.1.2
> Environment: running machine: centos7.2 
> metadata db: PostgreSQL 11.3 on x86_64-pc-linux-gnu
> hive version: upgrade from version 3.0.0 to 3.1.2
>Reporter: John1Tang
>Assignee: John1Tang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.1.2
>
>   Original Estimate: 1h
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> The SchemaTool SQL scripts hit conflicts on already-existing indices and 
> columns, and were missing the double quotes needed to escape keyword 
> identifiers, when upgrading the metastore schema. Running
> {code:java}
> bin/schematool -dbType postgres -upgradeSchemaFrom 3.0.0{code}
> produced errors like these:
> {code:java}
> ALTER TABLE "GLOBAL_PRIVS" ADD COLUMN "AUTHORIZER" character varying(128) 
> DEFAULT NULL::character varying
> Error: ERROR: column "AUTHORIZER" of relation "GLOBAL_PRIVS" already exists 
> (state=42701,code=0){code}
> {code:java}
> ALTER TABLE COMPLETED_TXN_COMPONENTS ADD COLUMN IF NOT EXISTS 
> CTC_UPDATE_DELETE char(1) NULL
> Error: ERROR: relation "completed_txn_components" does not exist 
> (state=42P01,code=0)
> {code}
> I've already come up with a solution and created a pull request for this 
> issue.
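For the second error above, a corrected statement might look like the following. This is an assumption based on the error messages, not the actual patch: on Postgres, unquoted identifiers fold to lower case, so a schema created with quoted upper-case names must be referenced with quotes.

```sql
-- Hypothetical corrected statement: quote the identifiers so Postgres does
-- not fold them to lower case, and keep IF NOT EXISTS so re-runs are safe.
ALTER TABLE "COMPLETED_TXN_COMPONENTS"
  ADD COLUMN IF NOT EXISTS "CTC_UPDATE_DELETE" char(1) NULL;
```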



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24105) Refactor partition pruning

2020-09-01 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24105?focusedWorklogId=477535=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-477535
 ]

ASF GitHub Bot logged work on HIVE-24105:
-

Author: ASF GitHub Bot
Created on: 02/Sep/20 00:02
Start Date: 02/Sep/20 00:02
Worklog Time Spent: 10m 
  Work Description: scarlin-cloudera closed pull request #1454:
URL: https://github.com/apache/hive/pull/1454


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 477535)
Time Spent: 20m  (was: 10m)

> Refactor partition pruning
> --
>
> Key: HIVE-24105
> URL: https://issues.apache.org/jira/browse/HIVE-24105
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Steve Carlin
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> A small refactor of partition pruning.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24105) Refactor partition pruning

2020-09-01 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24105?focusedWorklogId=477536&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-477536
 ]

ASF GitHub Bot logged work on HIVE-24105:
-

Author: ASF GitHub Bot
Created on: 02/Sep/20 00:02
Start Date: 02/Sep/20 00:02
Worklog Time Spent: 10m 
  Work Description: scarlin-cloudera opened a new pull request #1454:
URL: https://github.com/apache/hive/pull/1454


   A small refactor of partition pruning.
   
   
   
   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 477536)
Time Spent: 0.5h  (was: 20m)

> Refactor partition pruning
> --
>
> Key: HIVE-24105
> URL: https://issues.apache.org/jira/browse/HIVE-24105
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Steve Carlin
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> A small refactor of partition pruning.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-21498) Upgrade Thrift to 0.12.0

2020-09-01 Thread Sai Hemanth Gantasala (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-21498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17188869#comment-17188869
 ] 

Sai Hemanth Gantasala commented on HIVE-21498:
--

This is the PR: [https://github.com/apache/hive/pull/1455]

> Upgrade Thrift to 0.12.0
> 
>
> Key: HIVE-21498
> URL: https://issues.apache.org/jira/browse/HIVE-21498
> Project: Hive
>  Issue Type: Bug
>  Components: Thrift API
>Reporter: Ajith S
>Assignee: Sai Hemanth Gantasala
>Priority: Critical
>  Labels: security
>
> Upgrade to consider security fixes.
> Especially https://issues.apache.org/jira/browse/THRIFT-4506



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HIVE-22622) Hive allows to create a struct with duplicate attribute names

2020-09-01 Thread Jesus Camacho Rodriguez (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez resolved HIVE-22622.

Fix Version/s: 4.0.0
   Resolution: Fixed

Pushed to master, thanks [~kkasa]!

> Hive allows to create a struct with duplicate attribute names
> -
>
> Key: HIVE-22622
> URL: https://issues.apache.org/jira/browse/HIVE-22622
> Project: Hive
>  Issue Type: Bug
>Reporter: Denys Kuzmenko
>Assignee: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> When you create a table with a struct that has the same attribute name 
> twice, Hive allows you to create it:
> create table test_struct (duplicateColumn struct<id:int,id:int>);
> You can insert data into it:
> insert into test_struct select named_struct("id",1,"id",1);
> But you cannot read it:
> select * from test_struct;
> returns: java.io.IOException: java.io.IOException: Error reading file: 
> hdfs://.../test_struct/delta_001_001_/bucket_0
> We can create and insert, but reading the struct part of the table fails. We 
> can still read all other columns (if we have more than one), but not the 
> struct anymore.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22622) Hive allows to create a struct with duplicate attribute names

2020-09-01 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22622?focusedWorklogId=477493&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-477493
 ]

ASF GitHub Bot logged work on HIVE-22622:
-

Author: ASF GitHub Bot
Created on: 01/Sep/20 22:26
Start Date: 01/Sep/20 22:26
Worklog Time Spent: 10m 
  Work Description: jcamachor merged pull request #1446:
URL: https://github.com/apache/hive/pull/1446


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 477493)
Time Spent: 50m  (was: 40m)

> Hive allows to create a struct with duplicate attribute names
> -
>
> Key: HIVE-22622
> URL: https://issues.apache.org/jira/browse/HIVE-22622
> Project: Hive
>  Issue Type: Bug
>Reporter: Denys Kuzmenko
>Assignee: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> When you create a table with a struct that has the same attribute name 
> twice, Hive allows you to create it:
> create table test_struct (duplicateColumn struct<id:int,id:int>);
> You can insert data into it:
> insert into test_struct select named_struct("id",1,"id",1);
> But you cannot read it:
> select * from test_struct;
> returns: java.io.IOException: java.io.IOException: Error reading file: 
> hdfs://.../test_struct/delta_001_001_/bucket_0
> We can create and insert, but reading the struct part of the table fails. We 
> can still read all other columns (if we have more than one), but not the 
> struct anymore.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-21000) Upgrade thrift to at least 0.10.0

2020-09-01 Thread Thejas Nair (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-21000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17188740#comment-17188740
 ] 

Thejas Nair commented on HIVE-21000:


Will track in HIVE-21498 

> Upgrade thrift to at least 0.10.0
> -
>
> Key: HIVE-21000
> URL: https://issues.apache.org/jira/browse/HIVE-21000
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Haindrich
>Priority: Major
> Attachments: HIVE-21000.01.patch, HIVE-21000.02.patch, 
> HIVE-21000.03.patch, HIVE-21000.04.patch, HIVE-21000.05.patch, 
> HIVE-21000.06.patch, HIVE-21000.07.patch, HIVE-21000.08.patch, 
> sampler_before.png
>
>
> I was looking into some compile profiles for tables with lots of columns; and 
> it turned out that [thrift 0.9.3 is allocating a 
> List|https://github.com/apache/hive/blob/8e30b5e029570407d8a1db67d322a95db705750e/standalone-metastore/metastore-common/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/FieldSchema.java#L348]
>  during every hashcode calculation; but luckily THRIFT-2877 is improving on 
> that - so I propose to upgrade to at least 0.10.0 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HIVE-21498) Upgrade Thrift to 0.12.0

2020-09-01 Thread Thejas Nair (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas Nair reassigned HIVE-21498:
--

Assignee: Sai Hemanth Gantasala

> Upgrade Thrift to 0.12.0
> 
>
> Key: HIVE-21498
> URL: https://issues.apache.org/jira/browse/HIVE-21498
> Project: Hive
>  Issue Type: Bug
>  Components: Thrift API
>Reporter: Ajith S
>Assignee: Sai Hemanth Gantasala
>Priority: Critical
>  Labels: security
>
> Upgrade to consider security fixes.
> Especially https://issues.apache.org/jira/browse/THRIFT-4506



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-21498) Upgrade Thrift to 0.12.0

2020-09-01 Thread Thejas Nair (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-21498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17188741#comment-17188741
 ] 

Thejas Nair commented on HIVE-21498:


[~hemanth619]

Can you please add a link to the pull request?

 

> Upgrade Thrift to 0.12.0
> 
>
> Key: HIVE-21498
> URL: https://issues.apache.org/jira/browse/HIVE-21498
> Project: Hive
>  Issue Type: Bug
>  Components: Thrift API
>Reporter: Ajith S
>Assignee: Sai Hemanth Gantasala
>Priority: Critical
>  Labels: security
>
> Upgrade to consider security fixes.
> Especially https://issues.apache.org/jira/browse/THRIFT-4506



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-21000) Upgrade thrift to at least 0.10.0

2020-09-01 Thread Thejas Nair (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas Nair updated HIVE-21000:
---
Resolution: Duplicate
Status: Resolved  (was: Patch Available)

> Upgrade thrift to at least 0.10.0
> -
>
> Key: HIVE-21000
> URL: https://issues.apache.org/jira/browse/HIVE-21000
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Haindrich
>Priority: Major
> Attachments: HIVE-21000.01.patch, HIVE-21000.02.patch, 
> HIVE-21000.03.patch, HIVE-21000.04.patch, HIVE-21000.05.patch, 
> HIVE-21000.06.patch, HIVE-21000.07.patch, HIVE-21000.08.patch, 
> sampler_before.png
>
>
> I was looking into some compile profiles for tables with lots of columns; and 
> it turned out that [thrift 0.9.3 is allocating a 
> List|https://github.com/apache/hive/blob/8e30b5e029570407d8a1db67d322a95db705750e/standalone-metastore/metastore-common/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/FieldSchema.java#L348]
>  during every hashcode calculation; but luckily THRIFT-2877 is improving on 
> that - so I propose to upgrade to at least 0.10.0 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24105) Refactor partition pruning

2020-09-01 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-24105:
--
Labels: pull-request-available  (was: )

> Refactor partition pruning
> --
>
> Key: HIVE-24105
> URL: https://issues.apache.org/jira/browse/HIVE-24105
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Steve Carlin
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> A small refactor of partition pruning.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24105) Refactor partition pruning

2020-09-01 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24105?focusedWorklogId=477371&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-477371
 ]

ASF GitHub Bot logged work on HIVE-24105:
-

Author: ASF GitHub Bot
Created on: 01/Sep/20 17:47
Start Date: 01/Sep/20 17:47
Worklog Time Spent: 10m 
  Work Description: scarlin-cloudera opened a new pull request #1454:
URL: https://github.com/apache/hive/pull/1454


   A small refactor of partition pruning.
   
   
   
   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 477371)
Remaining Estimate: 0h
Time Spent: 10m

> Refactor partition pruning
> --
>
> Key: HIVE-24105
> URL: https://issues.apache.org/jira/browse/HIVE-24105
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Steve Carlin
>Priority: Minor
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> A small refactor of partition pruning.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24104) NPE due to null key columns in ReduceSink after deduplication

2020-09-01 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24104?focusedWorklogId=477305&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-477305
 ]

ASF GitHub Bot logged work on HIVE-24104:
-

Author: ASF GitHub Bot
Created on: 01/Sep/20 16:05
Start Date: 01/Sep/20 16:05
Worklog Time Spent: 10m 
  Work Description: zabetak opened a new pull request #1453:
URL: https://github.com/apache/hive/pull/1453


   ### What changes were proposed in this pull request?
   
   Remove double backtracking of columns inside 
ReduceSinkDeDuplicationUtils#aggressiveDedup.
   
   ### Why are the changes needed?
   
   To prevent NPE during planning or execution. Examples in the JIRA case.
   
   ### Does this PR introduce _any_ user-facing change?
   
   Changes in the EXPLAIN plans but probably for the best.
   
   
   ### How was this patch tested?
   `mvn test -Dtest=TestMiniLlapLocalCliDriver 
-Dqfile=reduce_deduplicate_null_keys.q`



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 477305)
Remaining Estimate: 0h
Time Spent: 10m

> NPE due to null key columns in ReduceSink after deduplication
> -
>
> Key: HIVE-24104
> URL: https://issues.apache.org/jira/browse/HIVE-24104
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In some cases the {{ReduceSinkDeDuplication}} optimization creates ReduceSink 
> operators where the key columns are null. This can lead to NPE in various 
> places in the code. 
> The following stacktraces show some places where an NPE appears. Note that 
> the stacktraces do not correspond to the same query.
> +NPE  during planning+
> {noformat}
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.ql.plan.ExprNodeDesc$ExprNodeDescEqualityWrapper.equals(ExprNodeDesc.java:141)
>   at java.util.AbstractList.equals(AbstractList.java:523)
>   at 
> org.apache.hadoop.hive.ql.optimizer.SetReducerParallelism.process(SetReducerParallelism.java:101)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89)
>   at 
> org.apache.hadoop.hive.ql.lib.ForwardWalker.walk(ForwardWalker.java:74)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:120)
>   at 
> org.apache.hadoop.hive.ql.parse.TezCompiler.runStatsDependentOptimizations(TezCompiler.java:492)
>   at 
> org.apache.hadoop.hive.ql.parse.TezCompiler.optimizeOperatorPlan(TezCompiler.java:226)
>   at 
> org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:161)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12643)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:443)
>   at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:301)
>   at 
> org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:171)
>   at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:301)
>   at org.apache.hadoop.hive.ql.Compiler.analyze(Compiler.java:220)
>   at org.apache.hadoop.hive.ql.Compiler.compile(Compiler.java:104)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:173)
>   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:414)
>   at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:363)
>   at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:357)
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:129)
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:231)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:258)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd1(CliDriver.java:203)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:129)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:424)
> 

[jira] [Updated] (HIVE-24104) NPE due to null key columns in ReduceSink after deduplication

2020-09-01 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-24104:
--
Labels: pull-request-available  (was: )

> NPE due to null key columns in ReduceSink after deduplication
> -
>
> Key: HIVE-24104
> URL: https://issues.apache.org/jira/browse/HIVE-24104
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In some cases the {{ReduceSinkDeDuplication}} optimization creates ReduceSink 
> operators where the key columns are null. This can lead to an NPE in various 
> places in the code. 
> The following stacktraces show some places where an NPE appears. Note that 
> the stacktraces do not correspond to the same query.
> +NPE during planning+
> {noformat}
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.ql.plan.ExprNodeDesc$ExprNodeDescEqualityWrapper.equals(ExprNodeDesc.java:141)
>   at java.util.AbstractList.equals(AbstractList.java:523)
>   at 
> org.apache.hadoop.hive.ql.optimizer.SetReducerParallelism.process(SetReducerParallelism.java:101)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89)
>   at 
> org.apache.hadoop.hive.ql.lib.ForwardWalker.walk(ForwardWalker.java:74)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:120)
>   at 
> org.apache.hadoop.hive.ql.parse.TezCompiler.runStatsDependentOptimizations(TezCompiler.java:492)
>   at 
> org.apache.hadoop.hive.ql.parse.TezCompiler.optimizeOperatorPlan(TezCompiler.java:226)
>   at 
> org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:161)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12643)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:443)
>   at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:301)
>   at 
> org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:171)
>   at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:301)
>   at org.apache.hadoop.hive.ql.Compiler.analyze(Compiler.java:220)
>   at org.apache.hadoop.hive.ql.Compiler.compile(Compiler.java:104)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:173)
>   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:414)
>   at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:363)
>   at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:357)
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:129)
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:231)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:258)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd1(CliDriver.java:203)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:129)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:424)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:355)
>   at 
> org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:740)
>   at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:710)
>   at 
> org.apache.hadoop.hive.cli.control.CoreCliDriver.runTest(CoreCliDriver.java:170)
>   at 
> org.apache.hadoop.hive.cli.control.CliAdapter.runTest(CliAdapter.java:157)
>   at 
> org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver(TestMiniLlapLocalCliDriver.java:62)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>   at 
> 

[jira] [Work logged] (HIVE-24089) Run QB compaction as table directory user with impersonation

2020-09-01 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24089?focusedWorklogId=477304=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-477304
 ]

ASF GitHub Bot logged work on HIVE-24089:
-

Author: ASF GitHub Bot
Created on: 01/Sep/20 16:04
Start Date: 01/Sep/20 16:04
Worklog Time Spent: 10m 
  Work Description: klcopp merged pull request #1441:
URL: https://github.com/apache/hive/pull/1441


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 477304)
Time Spent: 20m  (was: 10m)

> Run QB compaction as table directory user with impersonation
> 
>
> Key: HIVE-24089
> URL: https://issues.apache.org/jira/browse/HIVE-24089
> Project: Hive
>  Issue Type: Bug
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Currently QB compaction runs as the session user, unlike MR compaction which 
> runs as the table/partition directory owner (see 
> CompactorThread#findUserToRunAs).
> We should make QB compaction run as the table/partition directory owner and 
> enable user impersonation during compaction to avoid any issues with temp 
> directories.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24104) NPE due to null key columns in ReduceSink after deduplication

2020-09-01 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-24104:
---
Description: 
In some cases the {{ReduceSinkDeDuplication}} optimization creates ReduceSink 
operators where the key columns are null. This can lead to an NPE in various 
places in the code. 

The following stacktraces show some places where an NPE appears. Note that the 
stacktraces do not correspond to the same query.

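The failure mode described above can be illustrated with a minimal sketch (hypothetical class names, not Hive's actual `ExprNodeDescEqualityWrapper`): an equality wrapper that delegates directly to `field.equals(...)` throws an NPE as soon as the wrapped field is null, while `Objects.equals` handles the null case.

```java
import java.util.Objects;

/** Toy sketch of an equality wrapper over a possibly-null key column. */
public class EqualsNpeSketch {

    // Mirrors the problematic pattern: delegates to col.equals(...) directly.
    static final class UnsafeWrapper {
        final String col; // may be null, like a missing key column
        UnsafeWrapper(String col) { this.col = col; }
        @Override public boolean equals(Object o) {
            if (!(o instanceof UnsafeWrapper)) return false;
            return col.equals(((UnsafeWrapper) o).col); // NPE when col == null
        }
        @Override public int hashCode() { return Objects.hashCode(col); }
    }

    // Null-safe variant using Objects.equals.
    static final class SafeWrapper {
        final String col;
        SafeWrapper(String col) { this.col = col; }
        @Override public boolean equals(Object o) {
            if (!(o instanceof SafeWrapper)) return false;
            return Objects.equals(col, ((SafeWrapper) o).col);
        }
        @Override public int hashCode() { return Objects.hashCode(col); }
    }

    public static void main(String[] args) {
        boolean npe = false;
        try {
            new UnsafeWrapper(null).equals(new UnsafeWrapper("k"));
        } catch (NullPointerException e) {
            npe = true; // the unsafe pattern blows up on a null column
        }
        if (!npe) throw new AssertionError("expected NPE");
        if (!new SafeWrapper(null).equals(new SafeWrapper(null)))
            throw new AssertionError("null columns should compare equal");
        System.out.println("ok");
    }
}
```

The real fix, of course, is to avoid producing ReduceSink operators with null key columns in the first place; the null-safe comparison only shows where the symptom surfaces.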
+NPE during planning+
{noformat}
java.lang.NullPointerException
at 
org.apache.hadoop.hive.ql.plan.ExprNodeDesc$ExprNodeDescEqualityWrapper.equals(ExprNodeDesc.java:141)
at java.util.AbstractList.equals(AbstractList.java:523)
at 
org.apache.hadoop.hive.ql.optimizer.SetReducerParallelism.process(SetReducerParallelism.java:101)
at 
org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
at 
org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105)
at 
org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89)
at 
org.apache.hadoop.hive.ql.lib.ForwardWalker.walk(ForwardWalker.java:74)
at 
org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:120)
at 
org.apache.hadoop.hive.ql.parse.TezCompiler.runStatsDependentOptimizations(TezCompiler.java:492)
at 
org.apache.hadoop.hive.ql.parse.TezCompiler.optimizeOperatorPlan(TezCompiler.java:226)
at 
org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:161)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12643)
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:443)
at 
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:301)
at 
org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:171)
at 
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:301)
at org.apache.hadoop.hive.ql.Compiler.analyze(Compiler.java:220)
at org.apache.hadoop.hive.ql.Compiler.compile(Compiler.java:104)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:173)
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:414)
at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:363)
at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:357)
at 
org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:129)
at 
org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:231)
at 
org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:258)
at org.apache.hadoop.hive.cli.CliDriver.processCmd1(CliDriver.java:203)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:129)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:424)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:355)
at 
org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:740)
at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:710)
at 
org.apache.hadoop.hive.cli.control.CoreCliDriver.runTest(CoreCliDriver.java:170)
at 
org.apache.hadoop.hive.cli.control.CliAdapter.runTest(CliAdapter.java:157)
at 
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver(TestMiniLlapLocalCliDriver.java:62)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.apache.hadoop.hive.cli.control.CliAdapter$2$1.evaluate(CliAdapter.java:135)
at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
at 
org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
at 

[jira] [Work logged] (HIVE-23852) Natively support Date type in ReduceSink operator

2020-09-01 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23852?focusedWorklogId=477292=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-477292
 ]

ASF GitHub Bot logged work on HIVE-23852:
-

Author: ASF GitHub Bot
Created on: 01/Sep/20 15:50
Start Date: 01/Sep/20 15:50
Worklog Time Spent: 10m 
  Work Description: pgaref commented on pull request #1274:
URL: https://github.com/apache/hive/pull/1274#issuecomment-684953747


   @abstractdog could you please check this when you have a sec?





Issue Time Tracking
---

Worklog Id: (was: 477292)
Time Spent: 1h 20m  (was: 1h 10m)

> Natively support Date type in ReduceSink operator
> -
>
> Key: HIVE-23852
> URL: https://issues.apache.org/jira/browse/HIVE-23852
> Project: Hive
>  Issue Type: Improvement
>Reporter: Panagiotis Garefalakis
>Assignee: Panagiotis Garefalakis
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> There is currently no native support, meaning these types end up being 
> serialized as multi-key columns, which is much slower (iterating through batch 
> columns instead of writing a value directly).





[jira] [Resolved] (HIVE-24087) FK side join elimination in presence of PK-FK constraint

2020-09-01 Thread Vineet Garg (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg resolved HIVE-24087.

Fix Version/s: 4.0.0
   Resolution: Fixed

> FK side join elimination in presence of PK-FK constraint
> 
>
> Key: HIVE-24087
> URL: https://issues.apache.org/jira/browse/HIVE-24087
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> If there is a PK-FK join, the join could be eliminated by removing the FK side if 
> the following conditions are met:
> * There is no row filtering on the FK side.
> * No columns from the FK side are required after the JOIN.
> * FK join columns are guaranteed to be unique (have a group by).
> * FK join columns are guaranteed to be NOT NULL (either an IS NOT NULL filter or 
> a constraint).
> *Example*
> {code:sql}
> EXPLAIN 
> SELECT customer_removal_n0.*
> FROM customer_removal_n0
> JOIN
> (SELECT lo_custkey
> FROM lineorder_removal_n0
> WHERE lo_custkey IS NOT NULL
> GROUP BY lo_custkey) fkSide ON fkSide.lo_custkey = 
> customer_removal_n0.c_custkey;
> {code}





[jira] [Work logged] (HIVE-24087) FK side join elimination in presence of PK-FK constraint

2020-09-01 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24087?focusedWorklogId=477269=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-477269
 ]

ASF GitHub Bot logged work on HIVE-24087:
-

Author: ASF GitHub Bot
Created on: 01/Sep/20 15:15
Start Date: 01/Sep/20 15:15
Worklog Time Spent: 10m 
  Work Description: vineetgarg02 merged pull request #1440:
URL: https://github.com/apache/hive/pull/1440


   





Issue Time Tracking
---

Worklog Id: (was: 477269)
Time Spent: 1h 10m  (was: 1h)

> FK side join elimination in presence of PK-FK constraint
> 
>
> Key: HIVE-24087
> URL: https://issues.apache.org/jira/browse/HIVE-24087
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> If there is a PK-FK join, the join could be eliminated by removing the FK side if 
> the following conditions are met:
> * There is no row filtering on the FK side.
> * No columns from the FK side are required after the JOIN.
> * FK join columns are guaranteed to be unique (have a group by).
> * FK join columns are guaranteed to be NOT NULL (either an IS NOT NULL filter or 
> a constraint).
> *Example*
> {code:sql}
> EXPLAIN 
> SELECT customer_removal_n0.*
> FROM customer_removal_n0
> JOIN
> (SELECT lo_custkey
> FROM lineorder_removal_n0
> WHERE lo_custkey IS NOT NULL
> GROUP BY lo_custkey) fkSide ON fkSide.lo_custkey = 
> customer_removal_n0.c_custkey;
> {code}





[jira] [Work logged] (HIVE-24081) Enable pre-materializing CTEs referenced in scalar subqueries

2020-09-01 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24081?focusedWorklogId=477266=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-477266
 ]

ASF GitHub Bot logged work on HIVE-24081:
-

Author: ASF GitHub Bot
Created on: 01/Sep/20 15:14
Start Date: 01/Sep/20 15:14
Worklog Time Spent: 10m 
  Work Description: jcamachor commented on a change in pull request #1437:
URL: https://github.com/apache/hive/pull/1437#discussion_r481221210



##
File path: ql/src/test/results/clientpositive/llap/cte_4.q.out
##
@@ -106,23 +106,17 @@ PREHOOK: query: create table s2 as
 with q1 as ( select key from src where key = '4')
 select * from q1
 PREHOOK: type: CREATETABLE_AS_SELECT
-PREHOOK: Input: default@q1

Review comment:
   Expected?







Issue Time Tracking
---

Worklog Id: (was: 477266)
Time Spent: 1h 40m  (was: 1.5h)

> Enable pre-materializing CTEs referenced in scalar subqueries
> -
>
> Key: HIVE-24081
> URL: https://issues.apache.org/jira/browse/HIVE-24081
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> HIVE-11752 introduces materializing CTEs based on the config
> {code}
> hive.optimize.cte.materialize.threshold
> {code}
> The goals of this jira are:
> * extending the implementation to support materializing CTEs referenced in 
> scalar subqueries
> * adding a config to materialize CTEs with aggregate output only





[jira] [Work logged] (HIVE-24081) Enable pre-materializing CTEs referenced in scalar subqueries

2020-09-01 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24081?focusedWorklogId=477265=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-477265
 ]

ASF GitHub Bot logged work on HIVE-24081:
-

Author: ASF GitHub Bot
Created on: 01/Sep/20 15:13
Start Date: 01/Sep/20 15:13
Worklog Time Spent: 10m 
  Work Description: jcamachor commented on a change in pull request #1437:
URL: https://github.com/apache/hive/pull/1437#discussion_r481220742



##
File path: ql/src/test/queries/clientpositive/cte_mat_6.q
##
@@ -0,0 +1,81 @@
+set hive.optimize.cte.materialize.threshold=1;
+
+create table t0(col0 int);
+
+insert into t0(col0) values
+(1),(2),
+(100),(100),(100),
+(200),(200);
+
+-- CTE is referenced from scalar subquery in the select clause
+explain
+with cte as (select count(*) as small_count from t0 where col0 < 10)
+select t0.col0, (select small_count from cte)
+from t0
+order by t0.col0;
+
+with cte as (select count(*) as small_count from t0 where col0 < 10)
+select t0.col0, (select small_count from cte)
+from t0
+order by t0.col0;
+
+-- disable cte materialization
+set hive.optimize.cte.materialize.threshold=-1;
+
+explain
+with cte as (select count(*) as small_count from t0 where col0 < 10)
+select t0.col0, (select small_count from cte)
+from t0
+order by t0.col0;
+
+
+with cte as (select count(*) as small_count from t0 where col0 < 10)
+select t0.col0, (select small_count from cte)
+from t0
+order by t0.col0;
+
+
+-- enable cte materialization
+set hive.optimize.cte.materialize.threshold=1;
+
+-- CTE is referenced from scalar subquery in the where clause
+explain
+with cte as (select count(*) as small_count from t0 where col0 < 10)
+select t0.col0
+from t0
+where t0.col0 > (select small_count from cte)
+order by t0.col0;
+
+with cte as (select count(*) as small_count from t0 where col0 < 10)
+select t0.col0
+from t0
+where t0.col0 > (select small_count from cte)
+order by t0.col0;
+
+-- CTE is referenced from scalar subquery in the having clause
+explain
+with cte as (select count(*) as small_count from t0 where col0 < 10)
+select t0.col0, count(*)
+from t0
+group by col0
+having count(*) > (select small_count from cte)
+order by t0.col0;
+
+with cte as (select count(*) as small_count from t0 where col0 < 10)
+select t0.col0, count(*)
+from t0
+group by col0
+having count(*) > (select small_count from cte)
+order by t0.col0;
+
+-- mix full aggregate and non-full aggregate ctes
+explain
+with cte1 as (select col0 as k1 from t0 where col0 = '5'),
+ cte2 as (select count(*) as all_count from t0),
+ cte3 as (select col0 as k3, col0 + col0 as k3_2x, count(*) as key_count 
from t0 group by col0)
+select t0.col0, count(*)
+from t0
+join cte1 on t0.col0 = cte1.k1
+join cte3 on t0.col0 = cte3.k3
+group by col0
+having count(*) > (select all_count from cte2)

Review comment:
   OK, I think it is fine to leave it then.







Issue Time Tracking
---

Worklog Id: (was: 477265)
Time Spent: 1.5h  (was: 1h 20m)

> Enable pre-materializing CTEs referenced in scalar subqueries
> -
>
> Key: HIVE-24081
> URL: https://issues.apache.org/jira/browse/HIVE-24081
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> HIVE-11752 introduces materializing CTEs based on the config
> {code}
> hive.optimize.cte.materialize.threshold
> {code}
> The goals of this jira are:
> * extending the implementation to support materializing CTEs referenced in 
> scalar subqueries
> * adding a config to materialize CTEs with aggregate output only





[jira] [Commented] (HIVE-24102) Add ENGINE=InnoDB for replication mysql schema changes and not exists clause for the table creation

2020-09-01 Thread Pravin Sinha (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17188454#comment-17188454
 ] 

Pravin Sinha commented on HIVE-24102:
-

+1

> Add ENGINE=InnoDB for replication mysql schema changes and not exists clause 
> for the table creation
> ---
>
> Key: HIVE-24102
> URL: https://issues.apache.org/jira/browse/HIVE-24102
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24102.01.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-24093) Remove unused hive.debug.localtask

2020-09-01 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24093?focusedWorklogId=477218=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-477218
 ]

ASF GitHub Bot logged work on HIVE-24093:
-

Author: ASF GitHub Bot
Created on: 01/Sep/20 13:14
Start Date: 01/Sep/20 13:14
Worklog Time Spent: 10m 
  Work Description: pgaref commented on pull request #1445:
URL: https://github.com/apache/hive/pull/1445#issuecomment-684843982


   +1 





Issue Time Tracking
---

Worklog Id: (was: 477218)
Time Spent: 20m  (was: 10m)

> Remove unused hive.debug.localtask
> --
>
> Key: HIVE-24093
> URL: https://issues.apache.org/jira/browse/HIVE-24093
> Project: Hive
>  Issue Type: Improvement
>Reporter: Mustafa Iman
>Assignee: Mustafa Iman
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> hive.debug.localtask was added in HIVE-1642. Even then, it was never used. 
> It was possibly a leftover from development/debugging. There are no 
> references to either HIVEDEBUGLOCALTASK or hive.debug.localtask in the 
> codebase.





[jira] [Assigned] (HIVE-23976) Enable vectorization for multi-col semi join reducers

2020-09-01 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-23976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor reassigned HIVE-23976:
---

Assignee: László Bodor  (was: Stamatis Zampetakis)

> Enable vectorization for multi-col semi join reducers
> -
>
> Key: HIVE-23976
> URL: https://issues.apache.org/jira/browse/HIVE-23976
> Project: Hive
>  Issue Type: Improvement
>Reporter: Stamatis Zampetakis
>Assignee: László Bodor
>Priority: Major
>
> HIVE-21196 introduces multi-column semi-join reducers in the query engine. 
> However, the implementation relies on GenericUDFMurmurHash, which is not 
> vectorized, so the respective operators cannot be executed in vectorized 
> mode. 





[jira] [Commented] (HIVE-23737) LLAP: Reuse dagDelete Feature Of Tez Custom Shuffle Handler Instead Of LLAP's dagDelete

2020-09-01 Thread Syed Shameerur Rahman (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-23737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17188409#comment-17188409
 ] 

Syed Shameerur Rahman commented on HIVE-23737:
--

[~gopalv] [~rajesh.balamohan] ping for review request!

> LLAP: Reuse dagDelete Feature Of Tez Custom Shuffle Handler Instead Of LLAP's 
> dagDelete
> ---
>
> Key: HIVE-23737
> URL: https://issues.apache.org/jira/browse/HIVE-23737
> Project: Hive
>  Issue Type: Improvement
>  Components: llap
>Reporter: Syed Shameerur Rahman
>Assignee: Syed Shameerur Rahman
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> LLAP has a dagDelete feature added as part of HIVE-9911, but now that Tez 
> has added support for dagDelete in its custom shuffle handler (TEZ-3362) we 
> could reuse that feature in LLAP. 
> There are some added advantages to using Tez's dagDelete feature rather than 
> LLAP's current dagDelete feature:
> 1) We can easily extend this feature to accommodate upcoming features 
> such as vertex and failed-task-attempt shuffle data cleanup. Refer to TEZ-3363 
> and TEZ-4129.
> 2) It will be easier to maintain this feature by separating it out from 
> Hive's code path. 





[jira] [Issue Comment Deleted] (HIVE-23737) LLAP: Reuse dagDelete Feature Of Tez Custom Shuffle Handler Instead Of LLAP's dagDelete

2020-09-01 Thread Syed Shameerur Rahman (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Syed Shameerur Rahman updated HIVE-23737:
-
Comment: was deleted

(was: [~szita] [~rajesh.balamohan] Ping for review request!)

> LLAP: Reuse dagDelete Feature Of Tez Custom Shuffle Handler Instead Of LLAP's 
> dagDelete
> ---
>
> Key: HIVE-23737
> URL: https://issues.apache.org/jira/browse/HIVE-23737
> Project: Hive
>  Issue Type: Improvement
>  Components: llap
>Reporter: Syed Shameerur Rahman
>Assignee: Syed Shameerur Rahman
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> LLAP has a dagDelete feature added as part of HIVE-9911, but now that Tez 
> has added support for dagDelete in its custom shuffle handler (TEZ-3362) we 
> could reuse that feature in LLAP. 
> There are some added advantages to using Tez's dagDelete feature rather than 
> LLAP's current dagDelete feature:
> 1) We can easily extend this feature to accommodate upcoming features 
> such as vertex and failed-task-attempt shuffle data cleanup. Refer to TEZ-3363 
> and TEZ-4129.
> 2) It will be easier to maintain this feature by separating it out from 
> Hive's code path. 





[jira] [Work logged] (HIVE-24059) Llap external client - Initial changes for running in cloud environment

2020-09-01 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24059?focusedWorklogId=477182=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-477182
 ]

ASF GitHub Bot logged work on HIVE-24059:
-

Author: ASF GitHub Bot
Created on: 01/Sep/20 12:09
Start Date: 01/Sep/20 12:09
Worklog Time Spent: 10m 
  Work Description: ShubhamChaurasia commented on a change in pull request 
#1418:
URL: https://github.com/apache/hive/pull/1418#discussion_r481088959



##
File path: 
llap-common/src/java/org/apache/hadoop/hive/llap/security/DefaultJwtSharedSecretProvider.java
##
@@ -0,0 +1,94 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.llap.security;
+
+import com.google.common.base.Preconditions;
+import io.jsonwebtoken.security.Keys;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.hive.conf.HiveConf;
+
+import java.io.IOException;
+import java.nio.ByteBuffer;
+import java.nio.CharBuffer;
+import java.nio.charset.StandardCharsets;
+import java.security.Key;
+
+import static 
org.apache.hadoop.hive.conf.HiveConf.ConfVars.LLAP_EXTERNAL_CLIENT_CLOUD_DEPLOYMENT_SETUP_ENABLED;
+import static 
org.apache.hadoop.hive.conf.HiveConf.ConfVars.LLAP_EXTERNAL_CLIENT_CLOUD_JWT_SHARED_SECRET;
+
+/**
+ * Default implementation of {@link JwtSecretProvider}.
+ *
+ * 1. It first tries to get shared secret from conf {@link 
HiveConf.ConfVars#LLAP_EXTERNAL_CLIENT_CLOUD_JWT_SHARED_SECRET}
+ * using {@link Configuration#getPassword(String)}.
+ *
+ * 2. If not found, it tries to read from env var {@link 
#LLAP_EXTERNAL_CLIENT_CLOUD_JWT_SHARED_SECRET_ENV_VAR}.
+ *
+ * If the secret is not found even after 1) and 2), the {@link #init(Configuration)} 
method throws a {@link NullPointerException}.
+ *
+ * It uses the same encryption and decryption secret which can be used to sign 
and verify JWT.
+ */
+public class DefaultJwtSharedSecretProvider implements JwtSecretProvider {
+
+  public static final String 
LLAP_EXTERNAL_CLIENT_CLOUD_JWT_SHARED_SECRET_ENV_VAR =
+  "LLAP_EXTERNAL_CLIENT_CLOUD_JWT_SHARED_SECRET_ENV_VAR";
+
+  private Key jwtEncryptionKey;
+
+  @Override public Key getEncryptionSecret() {
+return jwtEncryptionKey;
+  }
+
+  @Override public Key getDecryptionSecret() {
+return jwtEncryptionKey;
+  }
+
+  @Override public void init(final Configuration conf) {
+char[] sharedSecret;
+byte[] sharedSecretBytes = null;
+
+// try getting secret from conf first
+// if not found, get from env var - 
LLAP_EXTERNAL_CLIENT_CLOUD_JWT_SHARED_SECRET_ENV_VAR
+try {
+  sharedSecret = 
conf.getPassword(LLAP_EXTERNAL_CLIENT_CLOUD_JWT_SHARED_SECRET.varname);
+} catch (IOException e) {
+  throw new RuntimeException("Unable to get password 
[hive.llap.external.client.cloud.jwt.shared.secret] - "
+  + e.getMessage(), e);
+}
+if (sharedSecret != null) {
+  ByteBuffer bb = 
StandardCharsets.UTF_8.encode(CharBuffer.wrap(sharedSecret));
+  sharedSecretBytes = new byte[bb.remaining()];
+  bb.get(sharedSecretBytes);
+} else {
+  String sharedSecretFromEnv = 
System.getenv(LLAP_EXTERNAL_CLIENT_CLOUD_JWT_SHARED_SECRET_ENV_VAR);
+  if (sharedSecretFromEnv != null) {
+sharedSecretBytes = sharedSecretFromEnv.getBytes();
+  }
+}
+
+Preconditions.checkNotNull(sharedSecretBytes,

Review comment:
   done
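For context, the char[]-to-byte[] step in the snippet above can be isolated into a small self-contained sketch (class and method names here are hypothetical, not part of Hive): encoding through CharBuffer/ByteBuffer avoids materializing the secret as an interned String.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical sketch of the char[] -> UTF-8 byte[] conversion technique used
// in the snippet above. Going through CharBuffer/ByteBuffer (instead of
// new String(secret).getBytes()) avoids creating a String copy of the secret,
// and lets us zero the intermediate buffer afterwards.
public final class SecretBytes {

  private SecretBytes() { }

  // Encode a secret held in a char[] to UTF-8 bytes without going through String.
  public static byte[] toUtf8Bytes(char[] secret) {
    ByteBuffer bb = StandardCharsets.UTF_8.encode(CharBuffer.wrap(secret));
    byte[] bytes = new byte[bb.remaining()];
    bb.get(bytes);
    if (bb.hasArray()) {
      // best effort: clear the intermediate buffer so the secret does not linger
      Arrays.fill(bb.array(), (byte) 0);
    }
    return bytes;
  }

  public static void main(String[] args) {
    byte[] bytes = toUtf8Bytes(new char[] {'h', 'u', 'n', 't', '2'});
    System.out.println(bytes.length); // 5: all ASCII, one byte per char
    System.out.println(new String(bytes, StandardCharsets.UTF_8)); // hunt2
  }
}
```

Note that the env-var fallback in the patch uses `String.getBytes()` with the platform default charset, so the two paths only agree when the default charset is UTF-8.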

##
File path: 
llap-common/src/java/org/apache/hadoop/hive/llap/security/DefaultJwtSharedSecretProvider.java
##
@@ -0,0 +1,94 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is 

[jira] [Assigned] (HIVE-24104) NPE due to null key columns in ReduceSink after deduplication

2020-09-01 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis reassigned HIVE-24104:
--


> NPE due to null key columns in ReduceSink after deduplication
> -
>
> Key: HIVE-24104
> URL: https://issues.apache.org/jira/browse/HIVE-24104
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>
> In some cases the {{ReduceSinkDeDuplication}} optimization creates ReduceSink 
> operators where the key columns are null. This can lead to NPE in various 
> places in the code. 
> The following stacktrace shows an example where NPE is raised due to key 
> columns being null.
> {noformat}
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.ql.plan.ExprNodeDesc$ExprNodeDescEqualityWrapper.equals(ExprNodeDesc.java:141)
>   at java.util.AbstractList.equals(AbstractList.java:523)
>   at 
> org.apache.hadoop.hive.ql.optimizer.SetReducerParallelism.process(SetReducerParallelism.java:101)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89)
>   at 
> org.apache.hadoop.hive.ql.lib.ForwardWalker.walk(ForwardWalker.java:74)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:120)
>   at 
> org.apache.hadoop.hive.ql.parse.TezCompiler.runStatsDependentOptimizations(TezCompiler.java:492)
>   at 
> org.apache.hadoop.hive.ql.parse.TezCompiler.optimizeOperatorPlan(TezCompiler.java:226)
>   at 
> org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:161)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12643)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:443)
>   at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:301)
>   at 
> org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:171)
>   at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:301)
>   at org.apache.hadoop.hive.ql.Compiler.analyze(Compiler.java:220)
>   at org.apache.hadoop.hive.ql.Compiler.compile(Compiler.java:104)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:173)
>   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:414)
>   at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:363)
>   at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:357)
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:129)
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:231)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:258)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd1(CliDriver.java:203)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:129)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:424)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:355)
>   at 
> org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:740)
>   at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:710)
>   at 
> org.apache.hadoop.hive.cli.control.CoreCliDriver.runTest(CoreCliDriver.java:170)
>   at 
> org.apache.hadoop.hive.cli.control.CliAdapter.runTest(CliAdapter.java:157)
>   at 
> org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver(TestMiniLlapLocalCliDriver.java:62)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.apache.hadoop.hive.cli.control.CliAdapter$2$1.evaluate(CliAdapter.java:135)
>   at 
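The failure mode described above can be reproduced outside Hive with a deliberately simplified sketch (the classes below are hypothetical stand-ins, not Hive's actual ExprNodeDesc code): a wrapper whose equals() dereferences its wrapped value throws NPE as soon as List.equals() compares a wrapper around null.

```java
import java.util.Arrays;
import java.util.List;

// Simplified, hypothetical model of the NPE: an equality wrapper (stand-in for
// ExprNodeDesc.ExprNodeDescEqualityWrapper) that dereferences its wrapped value
// in equals() blows up when a list of key columns contains a wrapper around null.
public class NullKeyEquality {

  static final class Wrapper {
    final String expr; // stand-in for the wrapped ExprNodeDesc

    Wrapper(String expr) { this.expr = expr; }

    @Override public boolean equals(Object o) {
      if (!(o instanceof Wrapper)) {
        return false;
      }
      // Dereferences this.expr without a null check - the source of the NPE
      // when key columns are left null after deduplication.
      return expr.equals(((Wrapper) o).expr);
    }

    @Override public int hashCode() {
      return expr == null ? 0 : expr.hashCode();
    }
  }

  public static void main(String[] args) {
    List<Wrapper> keysA = Arrays.asList(new Wrapper(null));   // "null key column"
    List<Wrapper> keysB = Arrays.asList(new Wrapper("_col0"));
    try {
      // AbstractList.equals walks both lists element by element, so this call
      // reaches Wrapper.equals on the null-wrapping element and throws.
      System.out.println(keysA.equals(keysB));
    } catch (NullPointerException e) {
      System.out.println("NullPointerException from Wrapper.equals, as in the stack trace above");
    }
  }
}
```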

[jira] [Updated] (HIVE-24103) TezClassLoader should be used in TezChild and for Configuration objects

2020-09-01 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-24103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor updated HIVE-24103:

Attachment: syslog_good.log
syslog_bad.log
hive_llap.log

> TezClassLoader should be used in TezChild and for Configuration objects
> ---
>
> Key: HIVE-24103
> URL: https://issues.apache.org/jira/browse/HIVE-24103
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Blocker
>  Labels: 0.10_blocker
> Fix For: 0.10.0
>
> Attachments: hive_llap.log, syslog_bad.log, syslog_good.log
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24103) TezClassLoader should be used in TezChild and for Configuration objects

2020-09-01 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-24103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor updated HIVE-24103:

Priority: Blocker  (was: Major)

> TezClassLoader should be used in TezChild and for Configuration objects
> ---
>
> Key: HIVE-24103
> URL: https://issues.apache.org/jira/browse/HIVE-24103
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Blocker
> Fix For: 0.10.0
>
> Attachments: hive_llap.log, syslog_bad.log, syslog_good.log
>
>






[jira] [Assigned] (HIVE-24103) TezClassLoader should be used in TezChild and for Configuration objects

2020-09-01 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-24103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor reassigned HIVE-24103:
---

Assignee: László Bodor

> TezClassLoader should be used in TezChild and for Configuration objects
> ---
>
> Key: HIVE-24103
> URL: https://issues.apache.org/jira/browse/HIVE-24103
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
> Fix For: 0.10.0
>
> Attachments: hive_llap.log, syslog_bad.log, syslog_good.log
>
>






[jira] [Updated] (HIVE-24103) TezClassLoader should be used in TezChild and for Configuration objects

2020-09-01 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-24103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor updated HIVE-24103:

Fix Version/s: 0.10.0

> TezClassLoader should be used in TezChild and for Configuration objects
> ---
>
> Key: HIVE-24103
> URL: https://issues.apache.org/jira/browse/HIVE-24103
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Bodor
>Priority: Major
> Fix For: 0.10.0
>
> Attachments: hive_llap.log, syslog_bad.log, syslog_good.log
>
>






[jira] [Updated] (HIVE-24103) TezClassLoader should be used in TezChild and for Configuration objects

2020-09-01 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-24103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor updated HIVE-24103:

Labels: 0.10_blocker  (was: )

> TezClassLoader should be used in TezChild and for Configuration objects
> ---
>
> Key: HIVE-24103
> URL: https://issues.apache.org/jira/browse/HIVE-24103
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Blocker
>  Labels: 0.10_blocker
> Fix For: 0.10.0
>
> Attachments: hive_llap.log, syslog_bad.log, syslog_good.log
>
>






[jira] [Work started] (HIVE-24103) TezClassLoader should be used in TezChild and for Configuration objects

2020-09-01 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-24103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-24103 started by László Bodor.
---
> TezClassLoader should be used in TezChild and for Configuration objects
> ---
>
> Key: HIVE-24103
> URL: https://issues.apache.org/jira/browse/HIVE-24103
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Blocker
>  Labels: 0.10_blocker
> Fix For: 0.10.0
>
> Attachments: hive_llap.log, syslog_bad.log, syslog_good.log
>
>






[jira] [Updated] (HIVE-24064) Disable Materialized View Replication

2020-09-01 Thread Anishek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal updated HIVE-24064:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to master! Thanks for the patch [~^sharma] and the review [~aasha].

> Disable Materialized View Replication
> -
>
> Key: HIVE-24064
> URL: https://issues.apache.org/jira/browse/HIVE-24064
> Project: Hive
>  Issue Type: Bug
>Reporter: Arko Sharma
>Assignee: Arko Sharma
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24064.01.patch, HIVE-24064.02.patch, 
> HIVE-24064.03.patch, HIVE-24064.04.patch, HIVE-24064.05.patch, 
> HIVE-24064.06.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>






[jira] [Updated] (HIVE-24102) Add ENGINE=InnoDB for replication mysql schema changes and not exists clause for the table creation

2020-09-01 Thread Aasha Medhi (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aasha Medhi updated HIVE-24102:
---
Attachment: HIVE-24102.01.patch
Status: Patch Available  (was: In Progress)

> Add ENGINE=InnoDB for replication mysql schema changes and not exists clause 
> for the table creation
> ---
>
> Key: HIVE-24102
> URL: https://issues.apache.org/jira/browse/HIVE-24102
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24102.01.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-24102) Add ENGINE=InnoDB for replication mysql schema changes and not exists clause for the table creation

2020-09-01 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24102?focusedWorklogId=477113&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-477113
 ]

ASF GitHub Bot logged work on HIVE-24102:
-

Author: ASF GitHub Bot
Created on: 01/Sep/20 09:58
Start Date: 01/Sep/20 09:58
Worklog Time Spent: 10m 
  Work Description: aasha opened a new pull request #1452:
URL: https://github.com/apache/hive/pull/1452


   …nd not exists clause for the table creation
   
   
   
   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 477113)
Remaining Estimate: 0h
Time Spent: 10m

> Add ENGINE=InnoDB for replication mysql schema changes and not exists clause 
> for the table creation
> ---
>
> Key: HIVE-24102
> URL: https://issues.apache.org/jira/browse/HIVE-24102
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>






[jira] [Updated] (HIVE-24102) Add ENGINE=InnoDB for replication mysql schema changes and not exists clause for the table creation

2020-09-01 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-24102:
--
Labels: pull-request-available  (was: )

> Add ENGINE=InnoDB for replication mysql schema changes and not exists clause 
> for the table creation
> ---
>
> Key: HIVE-24102
> URL: https://issues.apache.org/jira/browse/HIVE-24102
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>






[jira] [Work started] (HIVE-24102) Add ENGINE=InnoDB for replication mysql schema changes and not exists clause for the table creation

2020-09-01 Thread Aasha Medhi (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-24102 started by Aasha Medhi.
--
> Add ENGINE=InnoDB for replication mysql schema changes and not exists clause 
> for the table creation
> ---
>
> Key: HIVE-24102
> URL: https://issues.apache.org/jira/browse/HIVE-24102
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>






[jira] [Updated] (HIVE-24102) Add ENGINE=InnoDB for replication mysql schema changes and not exists clause for the table creation

2020-09-01 Thread Aasha Medhi (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aasha Medhi updated HIVE-24102:
---
Summary: Add ENGINE=InnoDB for replication mysql schema changes and not 
exists clause for the table creation  (was: Add ENGINE=InnoDB for replication 
mysql schema changes)

> Add ENGINE=InnoDB for replication mysql schema changes and not exists clause 
> for the table creation
> ---
>
> Key: HIVE-24102
> URL: https://issues.apache.org/jira/browse/HIVE-24102
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>






[jira] [Assigned] (HIVE-24102) Add ENGINE=InnoDB for replication mysql schema changes

2020-09-01 Thread Aasha Medhi (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aasha Medhi reassigned HIVE-24102:
--


> Add ENGINE=InnoDB for replication mysql schema changes
> --
>
> Key: HIVE-24102
> URL: https://issues.apache.org/jira/browse/HIVE-24102
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>






[jira] [Updated] (HIVE-22646) CTASing a dynamically partitioned MM table results in unreadable table

2020-09-01 Thread Karen Coppage (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karen Coppage updated HIVE-22646:
-
Status: Open  (was: Patch Available)

> CTASing a dynamically partitioned MM table results in unreadable table
> --
>
> Key: HIVE-22646
> URL: https://issues.apache.org/jira/browse/HIVE-22646
> Project: Hive
>  Issue Type: Bug
>Reporter: Karen Coppage
>Priority: Major
> Attachments: HIVE-22646.01.patch
>
>
> Repro steps: 
> {code:java}
> create table plain (i int, j int, s string);
> insert into plain values (1,1,'1');
> create table ctas partitioned by (s) tblproperties ('transactional'='true', 
> 'transactional_properties' = 'insert_only') as select * from plain;
> select * from ctas;
> {code}
>  
>  We get this error:
> {code:java}
> Error: java.io.IOException: java.io.IOException: Not a file: 
> file:/Users/karencoppage/data/upstream/warehouse/ctas/s=1/delta_002_002_/delta_002_002_
>  (state=,code=0){code}
> This also happens when CTASing from a dynamically partitioned table.
>  As seen in the error message, the issue is that a new delta directory is 
> created in the temp directory, and during MoveTask another delta dir is 
> unnecessarily created, then the first delta dir is moved into the second. The 
> table is unreadable since a file and not another delta dir is expected in the 
> top delta dir.





[jira] [Updated] (HIVE-24100) Syntax compile failure occurs when INSERT table column Order by is greater than 2 columns when CBO is false

2020-09-01 Thread GuangMing Lu (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

GuangMing Lu updated HIVE-24100:

Description: 
Executing the following SQL will fail to compile
{code:java}
set hive.cbo.enable=false;

-- create tabls --
create table table_1
(
item_id string, 
stru_area_id string
)partitioned by ( PT_DT string) stored as orc;

create table table_2
(
CREATE_ORG_ID string,
PROMOTION_ID  string,
PROMOTION_STATUS string
) partitioned by (pt_dt string) stored as orc;

create table table_3
(
STRU_ID string,
SUP_STRU string
) partitioned by(pt_dt string) stored as orc;

set hive.cbo.enable=false;
-- execute sql--
explain
insert into table table_1 partition(PT_DT = '2020-08-22')
(item_id , stru_area_id)
select '123' ITEM_ID , T.STRU_ID STRU_AREA_ID 
from ( 
  select 
  T0.STRU_ID STRU_ID ,T0.STRU_ID STRU_ID_BRANCH 
  from  table_3 T0 
) T
inner join ( 
  select 
  TT.CREATE_ORG_ID
  from  table_2 TT 
) TIV
on (T.STRU_ID_BRANCH = TIV.CREATE_ORG_ID) 
group by T.STRU_ID
order by 1,2;
{code}
{code:java}
org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: 
FAILED: SemanticException [Error 10004]: Line 5:28 Invalid table alias or 
column reference 'T': (possible column names are: _col0, _col1)
 at 
org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:341)
 ~[hive-service-3.1.0.jar:3.1.0]
 at 
org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:215)
 ~[hive-service-3.1.0.jar:3.1.0]
 at 
org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:316)
 ~[hive-service-3.1.0.jar:3.1.0]
 at org.apache.hive.service.cli.operation.Operation.run(Operation.java:253) 
~[hive-service-3.1.0.jar:3.1.0]
 at 
org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:684)
 ~[hive-service-3.1.0.jar:3.1.0]
 at 
org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:670)
 ~[hive-service-3.1.0.jar:3.1.0]
 at 
org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:342)
 ~[hive-service-3.1.0.jar:3.1.0]
 at 
org.apache.hive.service.cli.thrift.ThriftCLIService.executeNewStatement(ThriftCLIService.java:1144)
 ~[hive-service-3.1.0.jar:3.1.0]
 at 
org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:1280)
 ~[hive-service-3.1.0.jar:3.1.0]
 at 
org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1557)
 ~[hive-exec-3.1.0.jar:3.1.0]
 at 
org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1542)
 ~[hive-exec-3.1.0.jar:3.1.0]
 at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) 
~[hive-exec-3.1.0.jar:3.1.0]
 at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) 
~[hive-exec-3.1.0.jar:3.1.0]
 at 
org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:648)
 ~[hive-exec-3.1.0.jar:3.1.0]
 at 
org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
 ~[hive-exec-3.1.0.jar:3.1.0]
 at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
~[?:1.8.0_201]
 at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
~[?:1.8.0_201]
 at java.lang.Thread.run(Thread.java:748) [?:1.8.0_201]
Caused by: org.apache.hadoop.hive.ql.parse.SemanticException: Line 5:28 Invalid 
table alias or column reference 'T': (possible column names are: _col0, _col1)
 at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genAllExprNodeDesc(SemanticAnalyzer.java:12689)
 ~[hive-exec-3.1.0.jar:3.1.0]
 at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:12629)
 ~[hive-exec-3.1.0.jar:3.1.0]
 at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:12597)
 ~[hive-exec-3.1.0.jar:3.1.0]
 at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:12575)
 ~[hive-exec-3.1.0.jar:3.1.0]
 at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genReduceSinkPlan(SemanticAnalyzer.java:8482)
 ~[hive-exec-3.1.0.jar:3.1.0]
 at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPostGroupByBodyPlan(SemanticAnalyzer.java:10616)
 ~[hive-exec-3.1.0.jar:3.1.0]
 at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:10515)
 ~[hive-exec-3.1.0.jar:3.1.0]
 at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:11434)
 ~[hive-exec-3.1.0.jar:3.1.0]
 at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:11304)
 ~[hive-exec-3.1.0.jar:3.1.0]
 at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genOPTree(SemanticAnalyzer.java:12090)
 ~[hive-exec-3.1.0.jar:3.1.0]
 at 

[jira] [Work logged] (HIVE-22622) Hive allows to create a struct with duplicate attribute names

2020-09-01 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22622?focusedWorklogId=477026&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-477026
 ]

ASF GitHub Bot logged work on HIVE-22622:
-

Author: ASF GitHub Bot
Created on: 01/Sep/20 07:13
Start Date: 01/Sep/20 07:13
Worklog Time Spent: 10m 
  Work Description: kasakrisz commented on pull request #1446:
URL: https://github.com/apache/hive/pull/1446#issuecomment-684505690


   Adding a test case for the various underlying formats doesn't make sense in 
this case because the duplicate check is performed in the semantic analysis 
phase of the `create table` statement, and the error message would be the same. 
So if a duplicate is found, the code flow doesn't reach the point where the 
table is actually created.
   
   Added a test case for nested struct.
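The duplicate-name check described above can be sketched in isolation (a hedged illustration only: the class, method, and case-insensitivity choice here are assumptions, not Hive's actual SemanticAnalyzer code): one pass over the declared field names with a set, run at analysis time before any table is created.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of a duplicate-field check performed during semantic
// analysis of `create table`. Names are illustrative; Hive's real check lives
// in its analyzer and reports ErrorMsg.AMBIGUOUS_STRUCT_ATTRIBUTE.
public class StructFieldCheck {

  // Returns the first duplicated field name (compared case-insensitively,
  // assuming column names are case-insensitive), or null if all are unique.
  public static String findDuplicate(List<String> fieldNames) {
    Set<String> seen = new HashSet<>();
    for (String name : fieldNames) {
      if (!seen.add(name.toLowerCase())) {
        return name;
      }
    }
    return null;
  }

  public static void main(String[] args) {
    // The struct from the bug report, with fields "id" and "id", is rejected:
    System.out.println(findDuplicate(Arrays.asList("id", "id")));   // id
    System.out.println(findDuplicate(Arrays.asList("id", "name"))); // null
  }
}
```

Because the check runs before table creation, the storage format (ORC, text, etc.) never comes into play, which is why per-format test cases add nothing here.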





Issue Time Tracking
---

Worklog Id: (was: 477026)
Time Spent: 40m  (was: 0.5h)

> Hive allows to create a struct with duplicate attribute names
> -
>
> Key: HIVE-22622
> URL: https://issues.apache.org/jira/browse/HIVE-22622
> Project: Hive
>  Issue Type: Bug
>Reporter: Denys Kuzmenko
>Assignee: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> When you create a table with a struct that has the same attribute name twice, 
> Hive allows you to create it:
> create table test_struct( duplicateColumn struct<id:int, id:int>);
> You can insert data into it:
> insert into test_struct select named_struct("id",1,"id",1);
> But you cannot read it:
> select * from test_struct;
> It returns: java.io.IOException: java.io.IOException: Error reading file: 
> hdfs://.../test_struct/delta_001_001_/bucket_0
> We can create and insert, but reads fail on the struct part of the table. We 
> can still read all other columns (if we have more than one) but not the 
> struct anymore.





[jira] [Work logged] (HIVE-22622) Hive allows to create a struct with duplicate attribute names

2020-09-01 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22622?focusedWorklogId=477019&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-477019
 ]

ASF GitHub Bot logged work on HIVE-22622:
-

Author: ASF GitHub Bot
Created on: 01/Sep/20 07:03
Start Date: 01/Sep/20 07:03
Worklog Time Spent: 10m 
  Work Description: kasakrisz commented on a change in pull request #1446:
URL: https://github.com/apache/hive/pull/1446#discussion_r480897671



##
File path: common/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java
##
@@ -471,6 +471,7 @@
   "Not an ordered-set aggregate function: {0}. WITHIN GROUP clause is 
not allowed.", true),
   WITHIN_GROUP_PARAMETER_MISMATCH(10422,
   "The number of hypothetical direct arguments ({0}) must match the 
number of ordering columns ({1})", true),
+  AMBIGUOUS_STRUCT_FIELD(10423, "Struct field is not unique: {0}", true),

Review comment:
   Renamed `field` to `attribute`







Issue Time Tracking
---

Worklog Id: (was: 477019)
Time Spent: 0.5h  (was: 20m)

> Hive allows to create a struct with duplicate attribute names
> -
>
> Key: HIVE-22622
> URL: https://issues.apache.org/jira/browse/HIVE-22622
> Project: Hive
>  Issue Type: Bug
>Reporter: Denys Kuzmenko
>Assignee: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> When you create a table with a struct that has the same attribute name twice, 
> Hive allows you to create it:
> create table test_struct( duplicateColumn struct<id:int, id:int>);
> You can insert data into it:
> insert into test_struct select named_struct("id",1,"id",1);
> But you cannot read it:
> select * from test_struct;
> It returns: java.io.IOException: java.io.IOException: Error reading file: 
> hdfs://.../test_struct/delta_001_001_/bucket_0
> We can create and insert, but reads fail on the struct part of the table. We 
> can still read all other columns (if we have more than one) but not the 
> struct anymore.





[jira] [Resolved] (HIVE-24099) unix_timestamp,intersect,except throws NPE

2020-09-01 Thread zhaolong (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhaolong resolved HIVE-24099.
-
Resolution: Duplicate

https://issues.apache.org/jira/browse/HIVE-24060
Duplicate issue.

> unix_timestamp,intersect,except throws NPE
> --
>
> Key: HIVE-24099
> URL: https://issues.apache.org/jira/browse/HIVE-24099
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 3.1.0
>Reporter: zhaolong
>Priority: Major
> Attachments: image-2020-09-01-10-22-07-549.png, 
> image-2020-09-01-10-26-14-062.png, image-2020-09-01-10-27-23-916.png
>
>
> unix_timestamp, intersect, and except throw NPE when cbo is false and 
> optimize.constant.propagation is false.
> Reproduction steps:
>  1. unix_timestamp:
>       set hive.cbo.enable=true;
>       set hive.optimize.constant.propagation=false;
>       create table test_pt(idx string, namex string) 
> partitioned by(pt_dt string) stored as orc;
>  explain extended select count(1) from test_pt where pt_dt 
> = unix_timestamp();
> !image-2020-09-01-10-22-07-549.png!
> 2. intersect:
>  create table t1(id int, name string, score int);
> create table t2(id int, name string, score int);
> insert into t1 values(1,'xiaoming', 98);
> insert into t2 values(2,'xiaohong', 95);
> select id from t1 intersect select id from t2;
> !image-2020-09-01-10-26-14-062.png!
> 3. except:
> select id from t1 except select id from t2;
>   !image-2020-09-01-10-27-23-916.png!





[jira] [Resolved] (HIVE-24101) Invalid table alias order by columns(>=2) if cbo is false

2020-09-01 Thread zhaolong (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhaolong resolved HIVE-24101.
-
Resolution: Duplicate

https://issues.apache.org/jira/browse/HIVE-24100
Duplicate issue.

> Invalid table alias order by columns(>=2) if cbo is false
> -
>
> Key: HIVE-24101
> URL: https://issues.apache.org/jira/browse/HIVE-24101
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 3.1.0
>Reporter: zhaolong
>Priority: Major
> Attachments: image-2020-09-01-11-29-50-729.png
>
>
> create table a
> (
> item_id string, 
> stru_area_id string
> )partitioned by ( PT_DT string) stored as orc;
> create table b
> (
> CREATE_ORG_ID string,
> PROMOTION_ID string,
> PROMOTION_STATUS string
> ) partitioned by (pt_dt string) stored as orc;
> create table c
> (
> STRU_ID string,
> SUP_STRU string
> ) partitioned by(pt_dt string) stored as orc;
> set hive.cbo.enable=false;
> explain
> insert into table a partition( PT_DT = '2020-08-22' )
> (item_id , stru_area_id)
> select 
>  '' ITEM_ID , T.STRU_ID STRU_AREA_ID 
> from ( 
>  select 
>  STRU_ID STRU_ID ,T0.STRU_ID STRU_ID_BRANCH 
>  from c T0 
> ) T
> inner join ( 
>  select 
>  CREATE_ORG_ID
>  from b TT 
> ) TIV
> on ( STRU_ID_BRANCH = CREATE_ORG_ID ) 
> group by T.STRU_ID
> order by 1,2;
> !image-2020-09-01-11-29-50-729.png!
> If the "order by 1,2" clause is removed, the query compiles fine.
>  


