from:"Ashutosh Chauhan \(Jira\)"

[jira] [Updated] (HIVE-23281) ObjectStore::convertToStorageDescriptor can be optimised to reduce calls to DB for ACID tables

2020-05-25 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-23281:

Status: Open  (was: Patch Available)

> ObjectStore::convertToStorageDescriptor can be optimised to reduce calls to 
> DB for ACID tables
> --
>
> Key: HIVE-23281
> URL: https://issues.apache.org/jira/browse/HIVE-23281
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Attachments: HIVE-23281.1.patch, HIVE-23281.2.patch, 
> HIVE-23281.3.patch, HIVE-23281.4.patch, HIVE-23281.5.patch, 
> HIVE-23281.6.patch, HIVE-23281.7.patch, HIVE-23281.8.patch, 
> HIVE-23281.9.patch, image-2020-04-23-13-56-17-210.png
>
>
> [https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java#L1980]
>  
> [https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java#L1982]
>  
> SkewInfo, bucketCols, ordering, etc. are not needed for ACID tables. It may be 
> good to check for transactional tables and get rid of these calls in table 
> lookups.
>  
> This should help in reducing DB calls.
> !image-2020-04-23-13-56-17-210.png|width=669,height=485!
>  
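The idea in the description can be sketched roughly as follows. This is a minimal illustration, not the attached patch: StoredDescriptor, Descriptor, and the load* methods are hypothetical stand-ins in which each load* call models one DB round trip. The only detail taken from Hive itself is the assumption that a transactional table carries the table parameter "transactional" set to "true".

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;

public class StorageDescriptorSketch {

    /** Hypothetical stand-in for the persisted descriptor; each load* call models one DB round trip. */
    static class StoredDescriptor {
        int dbCalls = 0;

        List<String> loadBucketCols() { dbCalls++; return Arrays.asList("id"); }
        List<String> loadSortCols() { dbCalls++; return Collections.emptyList(); }
        List<String> loadSkewedColNames() { dbCalls++; return Collections.emptyList(); }
    }

    /** Hypothetical converted descriptor holding only the fields named in the issue. */
    static class Descriptor {
        List<String> bucketCols = Collections.emptyList();
        List<String> sortCols = Collections.emptyList();
        List<String> skewedColNames = Collections.emptyList();
    }

    /** Assumption: a table is transactional when its "transactional" parameter is "true". */
    static boolean isTransactional(Map<String, String> tableParams) {
        return tableParams != null
            && "true".equalsIgnoreCase(tableParams.get("transactional"));
    }

    /** Skips the three lookups entirely for transactional tables. */
    static Descriptor convert(StoredDescriptor stored, Map<String, String> tableParams) {
        Descriptor d = new Descriptor();
        if (isTransactional(tableParams)) {
            return d; // no DB calls issued
        }
        d.bucketCols = stored.loadBucketCols();
        d.sortCols = stored.loadSortCols();
        d.skewedColNames = stored.loadSkewedColNames();
        return d;
    }
}
```

In this sketch a transactional table costs zero lookups and any other table costs three; the real saving depends on how the metastore fetches these fields.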



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Updated] (HIVE-23281) ObjectStore::convertToStorageDescriptor can be optimised to reduce calls to DB for ACID tables

2020-05-25 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-23281:

Attachment: HIVE-23281.9.patch

> ObjectStore::convertToStorageDescriptor can be optimised to reduce calls to 
> DB for ACID tables
> --
>
> Key: HIVE-23281
> URL: https://issues.apache.org/jira/browse/HIVE-23281
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Attachments: HIVE-23281.1.patch, HIVE-23281.2.patch, 
> HIVE-23281.3.patch, HIVE-23281.4.patch, HIVE-23281.5.patch, 
> HIVE-23281.6.patch, HIVE-23281.7.patch, HIVE-23281.8.patch, 
> HIVE-23281.9.patch, image-2020-04-23-13-56-17-210.png
>
>
> [https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java#L1980]
>  
> [https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java#L1982]
>  
> SkewInfo, bucketCols, ordering, etc. are not needed for ACID tables. It may be 
> good to check for transactional tables and get rid of these calls in table 
> lookups.
>  
> This should help in reducing DB calls.
> !image-2020-04-23-13-56-17-210.png|width=669,height=485!
>  




[jira] [Commented] (HIVE-23175) Skip serializing hadoop and tez config on HS side

2020-05-25 Thread Ashutosh Chauhan (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-23175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17116304#comment-17116304
 ] 

Ashutosh Chauhan commented on HIVE-23175:
-

[~mustafaiman] would you like to rebase your patch? Two new methods introduced 
in TezUtils in TEZ-4137 are static methods and we can duplicate those in Hive 
temporarily while waiting for a new Tez release.

> Skip serializing hadoop and tez config on HS side
> -
>
> Key: HIVE-23175
> URL: https://issues.apache.org/jira/browse/HIVE-23175
> Project: Hive
>  Issue Type: Improvement
>  Components: Tez
>Reporter: Mustafa Iman
>Assignee: Mustafa Iman
>Priority: Major
> Attachments: HIVE-23175.1.patch, HIVE-23175.2.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> HiveServer spends a lot of time serializing configuration objects. We can 
> skip putting hadoop and tez config xml files in payload assuming that the 
> configs are the same on both HS and Task side. This depends on Tez to load 
> local xml configs when creating config objects 
> [https://issues.apache.org/jira/browse/TEZ-4137] 
> Ideally we should be able to skip hive-site.xml too. However, if we skip 
> hive-site.xml at that stage, then we make wrong choices at tez dag build 
> stage due to missing configs.
> In the ideal version of this, we should not be both looking up configs and 
> putting new configs from and to the same config object at DAG and Vertex 
> build phases. Instead we should be looking up from a HS2's HiveConf object 
> and writing to a brand new JobConf for each vertex. That way we would not 
> have any unnecessary item in the jobconf for any vertex. However, the DAG and 
> Vertex build stages (TezTask#build) and a lot of other components called from 
> there treat a single config object as both the source of HS2-side config and the 
> target JobConf that they are putting vertex-level options into. It is very 
> hard to separate these concerns now.
> With this patch, we are reducing the size of JobConf (per vertex) by ~65%. It 
> should improve the transmit latency. However, the most significant gains are in 
> CPU time while compressing job configs, as the config objects are much smaller 
> now.
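One way to picture why the payload shrinks is a delta against the configuration that both sides can already load locally. The sketch below is only an illustration under that assumption and is not the code in the patch: ConfigDeltaSketch, delta, and rebuild are hypothetical names, plain maps stand in for the config objects, and values are assumed to be non-null.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class ConfigDeltaSketch {

    /**
     * Returns only the entries of 'full' that differ from what the receiver
     * can load locally. Assumes both sides see identical 'localDefaults'
     * and that 'full' never drops a key present in the defaults.
     */
    static Map<String, String> delta(Map<String, String> localDefaults,
                                     Map<String, String> full) {
        Map<String, String> out = new HashMap<>();
        for (Map.Entry<String, String> e : full.entrySet()) {
            if (!Objects.equals(localDefaults.get(e.getKey()), e.getValue())) {
                out.put(e.getKey(), e.getValue());
            }
        }
        return out;
    }

    /** Receiver side: start from the local defaults and overlay the shipped delta. */
    static Map<String, String> rebuild(Map<String, String> localDefaults,
                                       Map<String, String> delta) {
        Map<String, String> out = new HashMap<>(localDefaults);
        out.putAll(delta);
        return out;
    }
}
```

If the two sides do not actually share the same local files, the rebuilt config silently diverges, which is the same assumption the description states for the HS and task sides.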




[jira] [Updated] (HIVE-23536) Provide an option to skip stats generation for major compaction

2020-05-25 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-23536:

Fix Version/s: 4.0.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, Peter!

> Provide an option to skip stats generation for major compaction
> ---
>
> Key: HIVE-23536
> URL: https://issues.apache.org/jira/browse/HIVE-23536
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-23536.02.patch, HIVE-23536.03.patch, 
> HIVE-23536.patch
>
>
> Currently, major MR compaction regenerates stats every time if the column 
> stats table contains some data. Some configurations do not use stats, but 
> for historical reasons the column stats table can still contain some 
> data.
> We should provide a way to skip stats generation in these cases.




[jira] [Commented] (HIVE-23536) Provide an option to skip stats generation for major compaction

2020-05-25 Thread Ashutosh Chauhan (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-23536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17116301#comment-17116301
 ] 

Ashutosh Chauhan commented on HIVE-23536:
-

+1

> Provide an option to skip stats generation for major compaction
> ---
>
> Key: HIVE-23536
> URL: https://issues.apache.org/jira/browse/HIVE-23536
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
> Attachments: HIVE-23536.02.patch, HIVE-23536.03.patch, 
> HIVE-23536.patch
>
>
> Currently, major MR compaction regenerates stats every time if the column 
> stats table contains some data. Some configurations do not use stats, but 
> for historical reasons the column stats table can still contain some 
> data.
> We should provide a way to skip stats generation in these cases.




[jira] [Commented] (HIVE-23266) Remove QueryWrapper from ObjectStore

2020-05-25 Thread Ashutosh Chauhan (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-23266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17116297#comment-17116297
 ] 

Ashutosh Chauhan commented on HIVE-23266:
-

Can you please create a Review Board request for it?

> Remove QueryWrapper from ObjectStore
> 
>
> Key: HIVE-23266
> URL: https://issues.apache.org/jira/browse/HIVE-23266
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
> Attachments: HIVE-23266.1.patch, HIVE-23266.10.patch, 
> HIVE-23266.11.patch, HIVE-23266.2.patch, HIVE-23266.2.patch, 
> HIVE-23266.3.patch, HIVE-23266.4.patch, HIVE-23266.5.patch, 
> HIVE-23266.6.patch, HIVE-23266.7.patch, HIVE-23266.8.patch, HIVE-23266.9.patch
>
>
> There is currently a utility called {{QueryWrapper}} that makes a normal 
> {{Query}} auto-closable.  However, {{Query}} is now in fact already 
> auto-closing, so there is no need for this class.  In trying to remove it, I 
> realized that this wrapper was being passed around in pretty convoluted ways 
> and also it was sometimes being created in a {{try-with-resources}} block but 
> then never actually used in any way.
> Remove the {{QueryWrapper}} from the class and simplify some of the DB 
> interactions.
> https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java#L178
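The point about the wrapper being redundant can be shown with a small sketch. FakeQuery below is a hypothetical stand-in for a query type that already implements AutoCloseable; it is not the JDO class, and runQuery is an illustrative helper rather than anything in ObjectStore.

```java
public class AutoCloseSketch {

    /** Hypothetical stand-in for a query object that is itself AutoCloseable. */
    static class FakeQuery implements AutoCloseable {
        boolean closed = false;

        String execute() {
            return "result";
        }

        @Override
        public void close() {
            closed = true; // release any underlying resources here
        }
    }

    /** Uses the query directly in try-with-resources; no wrapper class is needed. */
    static String runQuery(FakeQuery query) {
        try (FakeQuery q = query) {
            return q.execute();
        } // q.close() runs here, even if execute() throws
    }
}
```

Because the resource is declared in the try header and then actually used, the pattern also avoids the case mentioned above where a wrapper was created in a try-with-resources block but never touched.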




[jira] [Updated] (HIVE-23535) Bump Minimum Required Version of Maven to 3.0.5

2020-05-25 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-23535:

Fix Version/s: 4.0.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, [~belugabehr]

> Bump Minimum Required Version of Maven to 3.0.5
> ---
>
> Key: HIVE-23535
> URL: https://issues.apache.org/jira/browse/HIVE-23535
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-23535.1.patch
>
>
> {code:xml|title=pom.xml}
>   <prerequisites>
>     <maven>2.2.1</maven>
>   </prerequisites>
> {code}
> Time to upgrade to 3.x
> {quote}In Maven 3, use Maven Enforcer Plugin's requireMavenVersion rule, or 
> other rules to check other aspects.
> {quote}
> [https://maven.apache.org/pom.html#prerequisites]
>  
> We get the Enforcer Plugin from the Apache Parent POM.




[jira] [Updated] (HIVE-21624) LLAP: Cpu metrics at thread level is broken

2020-05-25 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-21624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-21624:

Attachment: HIVE-21624.3.patch

> LLAP: Cpu metrics at thread level is broken
> ---
>
> Key: HIVE-21624
> URL: https://issues.apache.org/jira/browse/HIVE-21624
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Nita Dembla
>Assignee: Prasanth Jayachandran
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21624.1.patch, HIVE-21624.2.patch, 
> HIVE-21624.3.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> ExecutorThreadCPUTime and ExecutorThreadUserTime rely on thread MX bean CPU 
> metrics when available. At some point, the thread name that the metrics 
> publisher looks for changed, causing no metrics to be published for these 
> counters.  
> The above counters look for threads with names starting with 
> "ContainerExecutor", but the LLAP task executor thread was renamed to 
> "Task-Executor".




[jira] [Updated] (HIVE-21624) LLAP: Cpu metrics at thread level is broken

2020-05-25 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-21624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-21624:

Status: Open  (was: Patch Available)

> LLAP: Cpu metrics at thread level is broken
> ---
>
> Key: HIVE-21624
> URL: https://issues.apache.org/jira/browse/HIVE-21624
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Nita Dembla
>Assignee: Prasanth Jayachandran
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21624.1.patch, HIVE-21624.2.patch, 
> HIVE-21624.3.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> ExecutorThreadCPUTime and ExecutorThreadUserTime rely on thread MX bean CPU 
> metrics when available. At some point, the thread name that the metrics 
> publisher looks for changed, causing no metrics to be published for these 
> counters.  
> The above counters look for threads with names starting with 
> "ContainerExecutor", but the LLAP task executor thread was renamed to 
> "Task-Executor".




[jira] [Updated] (HIVE-21624) LLAP: Cpu metrics at thread level is broken

2020-05-25 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-21624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-21624:

Status: Patch Available  (was: Open)

> LLAP: Cpu metrics at thread level is broken
> ---
>
> Key: HIVE-21624
> URL: https://issues.apache.org/jira/browse/HIVE-21624
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Nita Dembla
>Assignee: Prasanth Jayachandran
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21624.1.patch, HIVE-21624.2.patch, 
> HIVE-21624.3.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> ExecutorThreadCPUTime and ExecutorThreadUserTime rely on thread MX bean CPU 
> metrics when available. At some point, the thread name that the metrics 
> publisher looks for changed, causing no metrics to be published for these 
> counters.  
> The above counters look for threads with names starting with 
> "ContainerExecutor", but the LLAP task executor thread was renamed to 
> "Task-Executor".




[jira] [Updated] (HIVE-23281) ObjectStore::convertToStorageDescriptor can be optimised to reduce calls to DB for ACID tables

2020-05-25 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-23281:

Status: Open  (was: Patch Available)

> ObjectStore::convertToStorageDescriptor can be optimised to reduce calls to 
> DB for ACID tables
> --
>
> Key: HIVE-23281
> URL: https://issues.apache.org/jira/browse/HIVE-23281
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Attachments: HIVE-23281.1.patch, HIVE-23281.2.patch, 
> HIVE-23281.3.patch, HIVE-23281.4.patch, HIVE-23281.5.patch, 
> HIVE-23281.6.patch, HIVE-23281.7.patch, HIVE-23281.8.patch, 
> image-2020-04-23-13-56-17-210.png
>
>
> [https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java#L1980]
>  
> [https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java#L1982]
>  
> SkewInfo, bucketCols, ordering, etc. are not needed for ACID tables. It may be 
> good to check for transactional tables and get rid of these calls in table 
> lookups.
>  
> This should help in reducing DB calls.
> !image-2020-04-23-13-56-17-210.png|width=669,height=485!
>  




[jira] [Updated] (HIVE-23281) ObjectStore::convertToStorageDescriptor can be optimised to reduce calls to DB for ACID tables

2020-05-25 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-23281:

Attachment: (was: HIVE-23281.8.patch)

> ObjectStore::convertToStorageDescriptor can be optimised to reduce calls to 
> DB for ACID tables
> --
>
> Key: HIVE-23281
> URL: https://issues.apache.org/jira/browse/HIVE-23281
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Attachments: HIVE-23281.1.patch, HIVE-23281.2.patch, 
> HIVE-23281.3.patch, HIVE-23281.4.patch, HIVE-23281.5.patch, 
> HIVE-23281.6.patch, HIVE-23281.7.patch, HIVE-23281.8.patch, 
> image-2020-04-23-13-56-17-210.png
>
>
> [https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java#L1980]
>  
> [https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java#L1982]
>  
> SkewInfo, bucketCols, ordering, etc. are not needed for ACID tables. It may be 
> good to check for transactional tables and get rid of these calls in table 
> lookups.
>  
> This should help in reducing DB calls.
> !image-2020-04-23-13-56-17-210.png|width=669,height=485!
>  




[jira] [Updated] (HIVE-23281) ObjectStore::convertToStorageDescriptor can be optimised to reduce calls to DB for ACID tables

2020-05-25 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-23281:

Status: Patch Available  (was: Open)

> ObjectStore::convertToStorageDescriptor can be optimised to reduce calls to 
> DB for ACID tables
> --
>
> Key: HIVE-23281
> URL: https://issues.apache.org/jira/browse/HIVE-23281
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Attachments: HIVE-23281.1.patch, HIVE-23281.2.patch, 
> HIVE-23281.3.patch, HIVE-23281.4.patch, HIVE-23281.5.patch, 
> HIVE-23281.6.patch, HIVE-23281.7.patch, HIVE-23281.8.patch, 
> image-2020-04-23-13-56-17-210.png
>
>
> [https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java#L1980]
>  
> [https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java#L1982]
>  
> SkewInfo, bucketCols, ordering, etc. are not needed for ACID tables. It may be 
> good to check for transactional tables and get rid of these calls in table 
> lookups.
>  
> This should help in reducing DB calls.
> !image-2020-04-23-13-56-17-210.png|width=669,height=485!
>  




[jira] [Updated] (HIVE-23281) ObjectStore::convertToStorageDescriptor can be optimised to reduce calls to DB for ACID tables

2020-05-25 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-23281:

Attachment: HIVE-23281.8.patch

> ObjectStore::convertToStorageDescriptor can be optimised to reduce calls to 
> DB for ACID tables
> --
>
> Key: HIVE-23281
> URL: https://issues.apache.org/jira/browse/HIVE-23281
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Attachments: HIVE-23281.1.patch, HIVE-23281.2.patch, 
> HIVE-23281.3.patch, HIVE-23281.4.patch, HIVE-23281.5.patch, 
> HIVE-23281.6.patch, HIVE-23281.7.patch, HIVE-23281.8.patch, 
> image-2020-04-23-13-56-17-210.png
>
>
> [https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java#L1980]
>  
> [https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java#L1982]
>  
> SkewInfo, bucketCols, ordering, etc. are not needed for ACID tables. It may be 
> good to check for transactional tables and get rid of these calls in table 
> lookups.
>  
> This should help in reducing DB calls.
> !image-2020-04-23-13-56-17-210.png|width=669,height=485!
>  




[jira] [Updated] (HIVE-23281) ObjectStore::convertToStorageDescriptor can be optimised to reduce calls to DB for ACID tables

2020-05-25 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-23281:

Attachment: HIVE-23281.8.patch

> ObjectStore::convertToStorageDescriptor can be optimised to reduce calls to 
> DB for ACID tables
> --
>
> Key: HIVE-23281
> URL: https://issues.apache.org/jira/browse/HIVE-23281
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Attachments: HIVE-23281.1.patch, HIVE-23281.2.patch, 
> HIVE-23281.3.patch, HIVE-23281.4.patch, HIVE-23281.5.patch, 
> HIVE-23281.6.patch, HIVE-23281.7.patch, HIVE-23281.8.patch, 
> image-2020-04-23-13-56-17-210.png
>
>
> [https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java#L1980]
>  
> [https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java#L1982]
>  
> SkewInfo, bucketCols, ordering, etc. are not needed for ACID tables. It may be 
> good to check for transactional tables and get rid of these calls in table 
> lookups.
>  
> This should help in reducing DB calls.
> !image-2020-04-23-13-56-17-210.png|width=669,height=485!
>  




[jira] [Updated] (HIVE-23281) ObjectStore::convertToStorageDescriptor can be optimised to reduce calls to DB for ACID tables

2020-05-25 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-23281:

Status: Patch Available  (was: Open)

> ObjectStore::convertToStorageDescriptor can be optimised to reduce calls to 
> DB for ACID tables
> --
>
> Key: HIVE-23281
> URL: https://issues.apache.org/jira/browse/HIVE-23281
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Attachments: HIVE-23281.1.patch, HIVE-23281.2.patch, 
> HIVE-23281.3.patch, HIVE-23281.4.patch, HIVE-23281.5.patch, 
> HIVE-23281.6.patch, HIVE-23281.7.patch, HIVE-23281.8.patch, 
> image-2020-04-23-13-56-17-210.png
>
>
> [https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java#L1980]
>  
> [https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java#L1982]
>  
> SkewInfo, bucketCols, ordering, etc. are not needed for ACID tables. It may be 
> good to check for transactional tables and get rid of these calls in table 
> lookups.
>  
> This should help in reducing DB calls.
> !image-2020-04-23-13-56-17-210.png|width=669,height=485!
>  




[jira] [Updated] (HIVE-23281) ObjectStore::convertToStorageDescriptor can be optimised to reduce calls to DB for ACID tables

2020-05-25 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-23281:

Status: Open  (was: Patch Available)

> ObjectStore::convertToStorageDescriptor can be optimised to reduce calls to 
> DB for ACID tables
> --
>
> Key: HIVE-23281
> URL: https://issues.apache.org/jira/browse/HIVE-23281
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Attachments: HIVE-23281.1.patch, HIVE-23281.2.patch, 
> HIVE-23281.3.patch, HIVE-23281.4.patch, HIVE-23281.5.patch, 
> HIVE-23281.6.patch, HIVE-23281.7.patch, image-2020-04-23-13-56-17-210.png
>
>
> [https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java#L1980]
>  
> [https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java#L1982]
>  
> SkewInfo, bucketCols, ordering, etc. are not needed for ACID tables. It may be 
> good to check for transactional tables and get rid of these calls in table 
> lookups.
>  
> This should help in reducing DB calls.
> !image-2020-04-23-13-56-17-210.png|width=669,height=485!
>  




[jira] [Updated] (HIVE-23480) Test may fail due to a incorrect usage of a third party library

2020-05-25 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-23480:

Fix Version/s: 4.0.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, Panos!

> Test may fail due to a incorrect usage of a third party library
> ---
>
> Key: HIVE-23480
> URL: https://issues.apache.org/jira/browse/HIVE-23480
> Project: Hive
>  Issue Type: Bug
>  Components: HBase Metastore
>Reporter: contextshuffling
>Assignee: Panagiotis Garefalakis
>Priority: Minor
> Fix For: 4.0.0
>
> Attachments: HIVE-23480.01.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The test 
> {{org.apache.hadoop.hive.common.TestStatsSetupConst#testStatColumnEntriesCompat}}
>  relies on Jackson to serialize the params to a string. However, the Jackson 
> library uses the reflection API {{getDeclaredFields}}, which does not guarantee 
> any specific order of the returned fields, so the order of fields in the JSON 
> string might change, and thus the test can fail or pass without any changes to 
> the code.
> An example error message:
> {code:java}
> org.junit.ComparisonFailure: 
> expected:<{"[BASIC_STATS":"true","COLUMN_STATS":{"Foo":"true"}]}> but 
> was:<{"[COLUMN_STATS":{"Foo":"true"},"BASIC_STATS":"true"]}>
> at 
> org.apache.hadoop.hive.common.TestStatsSetupConst.testStatColumnEntriesCompat(TestStatsSetupConst.java:76)
> {code}
> Ideally, this test should not rely on the order returned by this API so that 
> it generates a deterministic result.
> A potential solution is to use a library like 
> https://github.com/skyscreamer/JSONassert to compare strings in an 
> order-agnostic way.
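The difference between an order-sensitive and an order-agnostic check can be shown with the standard library alone. This sketch does not use JSONassert or Jackson: OrderAgnosticCompareSketch, render, and sameFields are hypothetical names, and render is a deliberately naive serializer (flat string values, no escaping) that only serves to make the ordering visible.

```java
import java.util.Iterator;
import java.util.Map;

public class OrderAgnosticCompareSketch {

    /** Naive JSON-like rendering in the map's iteration order (no escaping, flat values only). */
    static String render(Map<String, String> fields) {
        StringBuilder sb = new StringBuilder("{");
        Iterator<Map.Entry<String, String>> it = fields.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, String> e = it.next();
            sb.append('"').append(e.getKey()).append("\":\"").append(e.getValue()).append('"');
            if (it.hasNext()) {
                sb.append(',');
            }
        }
        return sb.append('}').toString();
    }

    /** Order-agnostic check: Map.equals compares entries, not iteration order. */
    static boolean sameFields(Map<String, String> expected, Map<String, String> actual) {
        return expected.equals(actual);
    }
}
```

Two LinkedHashMap instances holding BASIC_STATS and COLUMN_STATS in opposite insertion order render to different strings, yet sameFields treats them as equal. A test that asserts on the parsed structure rather than the raw string is therefore immune to field ordering.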




[jira] [Commented] (HIVE-23480) Test may fail due to a incorrect usage of a third party library

2020-05-25 Thread Ashutosh Chauhan (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-23480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17115774#comment-17115774
 ] 

Ashutosh Chauhan commented on HIVE-23480:
-

+1

> Test may fail due to a incorrect usage of a third party library
> ---
>
> Key: HIVE-23480
> URL: https://issues.apache.org/jira/browse/HIVE-23480
> Project: Hive
>  Issue Type: Bug
>  Components: HBase Metastore
>Reporter: contextshuffling
>Assignee: Panagiotis Garefalakis
>Priority: Minor
> Attachments: HIVE-23480.01.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The test 
> {{org.apache.hadoop.hive.common.TestStatsSetupConst#testStatColumnEntriesCompat}}
>  relies on Jackson to serialize the params to a string. However, the Jackson 
> library uses the reflection API {{getDeclaredFields}}, which does not guarantee 
> any specific order of the returned fields, so the order of fields in the JSON 
> string might change, and thus the test can fail or pass without any changes to 
> the code.
> An example error message:
> {code:java}
> org.junit.ComparisonFailure: 
> expected:<{"[BASIC_STATS":"true","COLUMN_STATS":{"Foo":"true"}]}> but 
> was:<{"[COLUMN_STATS":{"Foo":"true"},"BASIC_STATS":"true"]}>
> at 
> org.apache.hadoop.hive.common.TestStatsSetupConst.testStatColumnEntriesCompat(TestStatsSetupConst.java:76)
> {code}
> Ideally, this test should not rely on the order returned by this API so that 
> it generates a deterministic result.
> A potential solution is to use a library like 
> https://github.com/skyscreamer/JSONassert to compare strings in an 
> order-agnostic way.




[jira] [Commented] (HIVE-23536) Provide an option to skip stats generation for major compaction

2020-05-25 Thread Ashutosh Chauhan (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-23536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17115768#comment-17115768
 ] 

Ashutosh Chauhan commented on HIVE-23536:
-

+1

> Provide an option to skip stats generation for major compaction
> ---
>
> Key: HIVE-23536
> URL: https://issues.apache.org/jira/browse/HIVE-23536
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
> Attachments: HIVE-23536.02.patch, HIVE-23536.03.patch, 
> HIVE-23536.patch
>
>
> Currently, major MR compaction regenerates stats every time if the column 
> stats table contains some data. Some configurations do not use stats, but 
> for historical reasons the column stats table can still contain some 
> data.
> We should provide a way to skip stats generation in these cases.




[jira] [Updated] (HIVE-23468) LLAP: Optimise OrcEncodedDataReader to avoid FS init to NN

2020-05-24 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-23468:

Status: Patch Available  (was: Open)

> LLAP: Optimise OrcEncodedDataReader to avoid FS init to NN
> --
>
> Key: HIVE-23468
> URL: https://issues.apache.org/jira/browse/HIVE-23468
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-23468.1.patch, HIVE-23468.2.patch, 
> HIVE-23468.3.patch
>
>
> OrcEncodedDataReader materializes the supplier to check whether it is an HDFS 
> file system. This causes an unwanted call to the NN even when the cache is 
> completely warmed up.
> [https://github.com/apache/hive/blob/master/llap-server/src/java/org/apache/hadoop/hive/llap/io/encoded/OrcEncodedDataReader.java#L540]
> [https://github.com/apache/hive/blob/9f40d7cc1d889aa3079f3f494cf810fabe326e44/ql/src/java/org/apache/hadoop/hive/ql/io/HdfsUtils.java#L107]
> The workaround is to set "hive.llap.io.use.fileid.path=false" to avoid this case.
> The IO elevator could get a 100% cache hit from the FileSystem impl in a warmed-up 
> scenario.
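The cost described here comes from forcing a lazily supplied object just to answer a cheap question. The sketch below is a hypothetical illustration and not the approach taken in the attached patches: LazyFsSketch, CountingSupplier, and readWithCache are invented names, and deciding "is this HDFS" from the URI scheme is an assumption made only for the example.

```java
import java.net.URI;
import java.util.function.Supplier;

public class LazyFsSketch {

    /** Hypothetical supplier that records how often the expensive handle is materialized. */
    static class CountingSupplier implements Supplier<String> {
        int calls = 0;

        @Override
        public String get() {
            calls++; // stands in for file system init, which would contact the NN
            return "filesystem-handle";
        }
    }

    /** Assumption for this sketch: the scheme alone is enough to tell HDFS apart. */
    static boolean isHdfsByScheme(URI path) {
        return "hdfs".equalsIgnoreCase(path.getScheme());
    }

    /** Touches the supplier only on a cache miss. */
    static String readWithCache(URI path, boolean cacheHit, Supplier<String> fs) {
        String kind = isHdfsByScheme(path) ? "hdfs" : "other";
        if (cacheHit) {
            return "cached:" + kind; // supplier never materialized
        }
        return fs.get() + ":" + kind;
    }
}
```

In the fully warmed-up case the supplier's call count stays at zero, which is the behaviour the issue is asking for.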




[jira] [Updated] (HIVE-23468) LLAP: Optimise OrcEncodedDataReader to avoid FS init to NN

2020-05-24 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-23468:

Attachment: HIVE-23468.3.patch

> LLAP: Optimise OrcEncodedDataReader to avoid FS init to NN
> --
>
> Key: HIVE-23468
> URL: https://issues.apache.org/jira/browse/HIVE-23468
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-23468.1.patch, HIVE-23468.2.patch, 
> HIVE-23468.3.patch
>
>
> OrcEncodedDataReader materializes the supplier to check whether it is an HDFS 
> file system. This causes an unwanted call to the NN even when the cache is 
> completely warmed up.
> [https://github.com/apache/hive/blob/master/llap-server/src/java/org/apache/hadoop/hive/llap/io/encoded/OrcEncodedDataReader.java#L540]
> [https://github.com/apache/hive/blob/9f40d7cc1d889aa3079f3f494cf810fabe326e44/ql/src/java/org/apache/hadoop/hive/ql/io/HdfsUtils.java#L107]
> The workaround is to set "hive.llap.io.use.fileid.path=false" to avoid this case.
> The IO elevator could get a 100% cache hit from the FileSystem impl in a warmed-up 
> scenario.




[jira] [Updated] (HIVE-23468) LLAP: Optimise OrcEncodedDataReader to avoid FS init to NN

2020-05-24 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-23468:

Assignee: Rajesh Balamohan
  Status: Open  (was: Patch Available)

> LLAP: Optimise OrcEncodedDataReader to avoid FS init to NN
> --
>
> Key: HIVE-23468
> URL: https://issues.apache.org/jira/browse/HIVE-23468
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-23468.1.patch, HIVE-23468.2.patch, 
> HIVE-23468.3.patch
>
>
> OrcEncodedDataReader materializes the supplier to check whether the file system
> is HDFS. This causes an unwanted call to the NN even when the cache is
> completely warmed up.
> [https://github.com/apache/hive/blob/master/llap-server/src/java/org/apache/hadoop/hive/llap/io/encoded/OrcEncodedDataReader.java#L540]
> [https://github.com/apache/hive/blob/9f40d7cc1d889aa3079f3f494cf810fabe326e44/ql/src/java/org/apache/hadoop/hive/ql/io/HdfsUtils.java#L107]
> The workaround is to set "hive.llap.io.use.fileid.path=false" to avoid this case.
> In a warmed-up scenario, the IO elevator could get a 100% cache hit from the
> FileSystem impl.




[jira] [Commented] (HIVE-23535) Bump Minimum Required Version of Maven to 3.0.5

2020-05-24 Thread Ashutosh Chauhan (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-23535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17115713#comment-17115713
 ] 

Ashutosh Chauhan commented on HIVE-23535:
-

+1
I assume {{requireMavenVersion}} will be added in a follow-up.

> Bump Minimum Required Version of Maven to 3.0.5
> ---
>
> Key: HIVE-23535
> URL: https://issues.apache.org/jira/browse/HIVE-23535
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
> Attachments: HIVE-23535.1.patch
>
>
> {code:xml|title=pom.xml}
>   <prerequisites>
>     <maven>2.2.1</maven>
>   </prerequisites>
> {code}
> Time to upgrade to 3.x
> {quote}In Maven 3, use Maven Enforcer Plugin's requireMavenVersion rule, or 
> other rules to check other aspects.
> {quote}
> [https://maven.apache.org/pom.html#prerequisites]
>  
> We get the Enforcer Plugin from the Apache Parent POM.




[jira] [Updated] (HIVE-21624) LLAP: Cpu metrics at thread level is broken

2020-05-24 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-21624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-21624:

Attachment: HIVE-21624.2.patch

> LLAP: Cpu metrics at thread level is broken
> ---
>
> Key: HIVE-21624
> URL: https://issues.apache.org/jira/browse/HIVE-21624
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Nita Dembla
>Assignee: Prasanth Jayachandran
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21624.1.patch, HIVE-21624.2.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> ExecutorThreadCPUTime and ExecutorThreadUserTime rely on thread MX bean CPU
> metrics when available. At some point, the name of the thread that the metrics
> publisher looks for changed, so no metrics are published for these
> counters.
> The counters look for threads whose names start with
> "ContainerExecutor", but the LLAP task executor thread was renamed to
> "Task-Executor".
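
To make the failure mode concrete, here is a hedged Java sketch of prefix-based per-thread CPU accounting with the standard java.lang.management API. It is not the LLAP metrics code: ExecutorCpuSketch, countThreads and cpuTimeNanosFor are hypothetical names, and only the two prefixes are taken from the description above. A publisher that filters on a stale prefix matches no threads and silently reports nothing.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

/** Illustrative only: not the LLAP metrics implementation. */
final class ExecutorCpuSketch {

    /** Counts live threads whose name starts with the given prefix. */
    static int countThreads(String prefix) {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        int count = 0;
        for (long id : bean.getAllThreadIds()) {
            ThreadInfo info = bean.getThreadInfo(id); // null if the thread has already died
            if (info != null && info.getThreadName().startsWith(prefix)) {
                count++;
            }
        }
        return count;
    }

    /** Sums CPU time in nanoseconds over matching threads; returns 0 if unsupported or disabled. */
    static long cpuTimeNanosFor(String prefix) {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        if (!bean.isThreadCpuTimeSupported() || !bean.isThreadCpuTimeEnabled()) {
            return 0L;
        }
        long total = 0L;
        for (long id : bean.getAllThreadIds()) {
            ThreadInfo info = bean.getThreadInfo(id);
            if (info == null || !info.getThreadName().startsWith(prefix)) {
                continue;
            }
            long cpu = bean.getThreadCpuTime(id); // -1 if the thread is no longer alive
            if (cpu > 0) {
                total += cpu;
            }
        }
        return total;
    }

    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(() -> {
            long x = 0;
            for (int i = 0; i < 50_000_000; i++) {
                x += i; // burn a little CPU
            }
            if (x == 42) {
                System.out.println(x); // keeps the loop from being optimized away
            }
            try {
                Thread.sleep(200);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "Task-Executor-0");
        worker.start();
        Thread.sleep(100);

        System.out.println("old prefix threads: " + countThreads("ContainerExecutor"));
        System.out.println("new prefix threads: " + countThreads("Task-Executor"));
        System.out.println("cpu ns (new prefix): " + cpuTimeNanosFor("Task-Executor"));
        worker.join();
    }
}
```

Sharing one constant for the thread name prefix between the executor and the metrics publisher would be one way to keep the two from drifting apart again; whether the actual patch does that is not stated in this thread.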




[jira] [Updated] (HIVE-21624) LLAP: Cpu metrics at thread level is broken

2020-05-24 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-21624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-21624:

Status: Patch Available  (was: Open)

> LLAP: Cpu metrics at thread level is broken
> ---
>
> Key: HIVE-21624
> URL: https://issues.apache.org/jira/browse/HIVE-21624
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Nita Dembla
>Assignee: Prasanth Jayachandran
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21624.1.patch, HIVE-21624.2.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> ExecutorThreadCPUTime and ExecutorThreadUserTime rely on thread MX bean CPU
> metrics when available. At some point, the name of the thread that the metrics
> publisher looks for changed, so no metrics are published for these
> counters.
> The counters look for threads whose names start with
> "ContainerExecutor", but the LLAP task executor thread was renamed to
> "Task-Executor".




[jira] [Updated] (HIVE-21624) LLAP: Cpu metrics at thread level is broken

2020-05-24 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-21624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-21624:

Status: Open  (was: Patch Available)

> LLAP: Cpu metrics at thread level is broken
> ---
>
> Key: HIVE-21624
> URL: https://issues.apache.org/jira/browse/HIVE-21624
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Nita Dembla
>Assignee: Prasanth Jayachandran
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21624.1.patch, HIVE-21624.2.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> ExecutorThreadCPUTime and ExecutorThreadUserTime rely on thread MX bean CPU
> metrics when available. At some point, the name of the thread that the metrics
> publisher looks for changed, so no metrics are published for these
> counters.
> The counters look for threads whose names start with
> "ContainerExecutor", but the LLAP task executor thread was renamed to
> "Task-Executor".




[jira] [Updated] (HIVE-23529) CTAS is broken for uniontype when row_deserialize

2020-05-24 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-23529:

Fix Version/s: 4.0.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, Mustafa!

> CTAS is broken for uniontype when row_deserialize
> -
>
> Key: HIVE-23529
> URL: https://issues.apache.org/jira/browse/HIVE-23529
> Project: Hive
>  Issue Type: Bug
>Reporter: Mustafa Iman
>Assignee: Mustafa Iman
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-23529.patch, HIVE-23529.patch
>
>
> CTAS queries fail when there is a uniontype in the source table and
> hive.vectorized.use.vector.serde.deserialize=false.
> ObjectInspectorUtils.copyToStandardObject in the ROW_DESERIALIZE path extracts
> the value from the union type. However, VectorAssignRow expects a StandardUnion
> object, causing a ClassCastException for any such CTAS query.
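
The mismatch described above (a copy step that unwraps the union, feeding a consumer that casts to the wrapper type) can be reproduced in a few lines. This is a sketch under stated assumptions, not Hive code: UnionCastSketch, TaggedUnion, assign, copyKeepingUnion and copyExtractingValue are hypothetical stand-ins and do not mirror the real StandardUnion, VectorAssignRow or ObjectInspectorUtils APIs.

```java
/** Illustrative only: hypothetical stand-ins, not Hive classes. */
final class UnionCastSketch {

    /** Hypothetical tagged union value: which branch is set, plus the payload. */
    static final class TaggedUnion {
        final byte tag;
        final Object value;

        TaggedUnion(byte tag, Object value) {
            this.tag = tag;
            this.value = value;
        }
    }

    /** Consumer that expects the wrapper and casts unconditionally. */
    static String assign(Object o) {
        TaggedUnion u = (TaggedUnion) o; // ClassCastException if handed a bare payload
        return u.tag + ":" + u.value;
    }

    /** Copy that keeps the wrapper, which is what the consumer needs. */
    static Object copyKeepingUnion(TaggedUnion u) {
        return new TaggedUnion(u.tag, u.value);
    }

    /** Copy that unwraps to the payload, which produces the mismatch. */
    static Object copyExtractingValue(TaggedUnion u) {
        return u.value;
    }

    public static void main(String[] args) {
        TaggedUnion source = new TaggedUnion((byte) 0, 42);
        System.out.println(assign(copyKeepingUnion(source)));
        try {
            assign(copyExtractingValue(source));
        } catch (ClassCastException e) {
            System.out.println("ClassCastException, as in the report");
        }
    }
}
```

Either side can be changed to restore agreement (keep the wrapper in the copy, or teach the consumer to accept a bare value); the thread does not say which approach the committed patch took.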




[jira] [Commented] (HIVE-23529) CTAS is broken for uniontype when row_deserialize

2020-05-24 Thread Ashutosh Chauhan (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-23529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17115666#comment-17115666
 ] 

Ashutosh Chauhan commented on HIVE-23529:
-

+1

> CTAS is broken for uniontype when row_deserialize
> -
>
> Key: HIVE-23529
> URL: https://issues.apache.org/jira/browse/HIVE-23529
> Project: Hive
>  Issue Type: Bug
>Reporter: Mustafa Iman
>Assignee: Mustafa Iman
>Priority: Major
> Attachments: HIVE-23529.patch, HIVE-23529.patch
>
>
> CTAS queries fail when there is a uniontype in the source table and
> hive.vectorized.use.vector.serde.deserialize=false.
> ObjectInspectorUtils.copyToStandardObject in the ROW_DESERIALIZE path extracts
> the value from the union type. However, VectorAssignRow expects a StandardUnion
> object, causing a ClassCastException for any such CTAS query.




[jira] [Commented] (HIVE-23536) Provide an option to skip stats generation for major compaction

2020-05-24 Thread Ashutosh Chauhan (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-23536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17115664#comment-17115664
 ] 

Ashutosh Chauhan commented on HIVE-23536:
-

Since we now have a query-based compactor, it's better to name the config
{{hive.mr.compactor.gather.stats}}.

> Provide an option to skip stats generation for major compaction
> ---
>
> Key: HIVE-23536
> URL: https://issues.apache.org/jira/browse/HIVE-23536
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
> Attachments: HIVE-23536.02.patch, HIVE-23536.patch
>
>
> Currently, major MR compaction regenerates stats every time if the column
> stats table contains some data. Some configurations do not use stats, but
> for historical reasons the column stats table can still contain some
> data.
> We should provide a way to skip stats generation in these cases.




[jira] [Updated] (HIVE-23494) Upgrade Apache parent POM to version 23

2020-05-24 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-23494:

Fix Version/s: 4.0.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, David!

> Upgrade Apache parent POM to version 23
> ---
>
> Key: HIVE-23494
> URL: https://issues.apache.org/jira/browse/HIVE-23494
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-23494.1.patch
>
>





[jira] [Commented] (HIVE-23494) Upgrade Apache parent POM to version 23

2020-05-24 Thread Ashutosh Chauhan (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-23494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17115660#comment-17115660
 ] 

Ashutosh Chauhan commented on HIVE-23494:
-

+1

> Upgrade Apache parent POM to version 23
> ---
>
> Key: HIVE-23494
> URL: https://issues.apache.org/jira/browse/HIVE-23494
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
> Attachments: HIVE-23494.1.patch
>
>





[jira] [Commented] (HIVE-23457) Hive Incorrect result with subquery while optimizer misses the aggregation stage

2020-05-24 Thread Ashutosh Chauhan (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-23457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17115659#comment-17115659
 ] 

Ashutosh Chauhan commented on HIVE-23457:
-

This behavior is correct as per the standard. The SQL standard does *not* allow
ORDER BY in a subquery; you may only have ORDER BY in the outermost query.
As a result, different databases behave differently, and each behavior can be
argued to be compliant. SQL Server, for example, throws a syntax error if you add
ORDER BY in a subquery. MySQL ignores it:
https://mariadb.com/kb/en/why-is-order-by-in-a-from-subquery-ignored/
So Hive's behavior can be considered compliant.
It is also relevant to note that the LIMIT clause is not part of the standard either.

> Hive Incorrect result with subquery while optimizer misses the aggregation 
> stage 
> -
>
> Key: HIVE-23457
> URL: https://issues.apache.org/jira/browse/HIVE-23457
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.2.0
>Reporter: Rajkumar Singh
>Priority: Critical
>
> Steps to Repro:
> {code:java}
> create table abc (id int);
> insert into table abc values (1),(2),(3),(4),(5),(6);
> select * from abc order by id desc
> 6
> 5
> 4
> 3
> 2
> 1
> select `id` from (select * from abc order by id desc ) as tmp;
> 1
> 2
> 3
> 4
> 5
> 6
>  
> {code}
> Looking at the query plan, it seems that with the subquery the optimizer missed
> the aggregation stage; I can't see any reduce stage.
> {code:java}
> set hive.query.results.cache.enabled=false;
> explain select * from abc order by id desc;
> ++
> |  Explain   |
> ++
> | Plan optimized by CBO. |
> ||
> | Vertex dependency in root stage|
> | Reducer 2 <- Map 1 (SIMPLE_EDGE)   |
> ||
> | Stage-0|
> |   Fetch Operator   |
> | limit:-1   |
> | Stage-1|
> |   Reducer 2 vectorized |
> |   File Output Operator [FS_8]  |
> | Select Operator [SEL_7] (rows=6 width=4)   |
> |   Output:["_col0"] |
> | <-Map 1 [SIMPLE_EDGE] vectorized   |
> |   SHUFFLE [RS_6]   |
> | Select Operator [SEL_5] (rows=6 width=4) |
> |   Output:["_col0"] |
> |   TableScan [TS_0] (rows=6 width=4)|
> | default@abc,abc, ACID 
> table,Tbl:COMPLETE,Col:COMPLETE,Output:["id"] |
> ||
> ++
> explain select `id` from (select * from abc order by id desc ) as tmp;
> +--+
> |   Explain|
> +--+
> | Plan optimized by CBO.   |
> |  |
> | Stage-0  |
> |   Fetch Operator |
> | limit:-1 |
> | Select Operator [SEL_1]  |
> |   Output:["_col0"]   |
> |   TableScan [TS_0]   |
> | Output:["id"]|
> |  |
> +--+
> {code}




[jira] [Updated] (HIVE-23494) Upgrade Apache parent POM to version 23

2020-05-22 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-23494:

Status: Patch Available  (was: Open)

> Upgrade Apache parent POM to version 23
> ---
>
> Key: HIVE-23494
> URL: https://issues.apache.org/jira/browse/HIVE-23494
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
> Attachments: HIVE-23494.1.patch
>
>





[jira] [Commented] (HIVE-22066) Upgrade Apache parent POM to version 21

2020-05-21 Thread Ashutosh Chauhan (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-22066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17113463#comment-17113463
 ] 

Ashutosh Chauhan commented on HIVE-22066:
-

+1

> Upgrade Apache parent POM to version 21
> ---
>
> Key: HIVE-22066
> URL: https://issues.apache.org/jira/browse/HIVE-22066
> Project: Hive
>  Issue Type: Improvement
>Reporter: Jesus Camacho Rodriguez
>Assignee: David Mollitor
>Priority: Major
> Attachments: HIVE-22066.10.patch, HIVE-22066.11.patch, 
> HIVE-22066.12.patch, HIVE-22066.2.patch, HIVE-22066.3.patch, 
> HIVE-22066.4.patch, HIVE-22066.5.patch, HIVE-22066.6.patch, 
> HIVE-22066.6.patch, HIVE-22066.7.patch, HIVE-22066.8.patch, 
> HIVE-22066.8.patch, HIVE-22066.9.patch, HIVE-22066.999.patch, 
> HIVE-22066.patch, HIVE-22066.patch
>
>





[jira] [Updated] (HIVE-23501) AOOB in VectorDeserializeRow when complex types are converted to primitive types

2020-05-20 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-23501:

Fix Version/s: 4.0.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, Ramesh!

> AOOB in VectorDeserializeRow when complex types are converted to primitive 
> types
> 
>
> Key: HIVE-23501
> URL: https://issues.apache.org/jira/browse/HIVE-23501
> Project: Hive
>  Issue Type: Bug
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-23501.1.patch, HIVE-23501.2.patch, 
> HIVE-23501.3.patch
>
>
> AOOB in VectorDeserializeRow when complex types are converted to primitive 
> types




[jira] [Commented] (HIVE-23501) AOOB in VectorDeserializeRow when complex types are converted to primitive types

2020-05-19 Thread Ashutosh Chauhan (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-23501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17111706#comment-17111706
 ] 

Ashutosh Chauhan commented on HIVE-23501:
-

+1

> AOOB in VectorDeserializeRow when complex types are converted to primitive 
> types
> 
>
> Key: HIVE-23501
> URL: https://issues.apache.org/jira/browse/HIVE-23501
> Project: Hive
>  Issue Type: Bug
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Attachments: HIVE-23501.1.patch, HIVE-23501.2.patch, 
> HIVE-23501.3.patch
>
>
> AOOB in VectorDeserializeRow when complex types are converted to primitive 
> types




[jira] [Commented] (HIVE-23488) Optimise PartitionManagementTask::Msck::repair

2020-05-17 Thread Ashutosh Chauhan (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-23488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17109917#comment-17109917
 ] 

Ashutosh Chauhan commented on HIVE-23488:
-

+1

> Optimise PartitionManagementTask::Msck::repair
> --
>
> Key: HIVE-23488
> URL: https://issues.apache.org/jira/browse/HIVE-23488
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Major
> Attachments: HIVE-23488.1.patch, Screenshot 2020-05-18 at 5.06.15 
> AM.png
>
>
> Ends up fetching table information twice.
> !Screenshot 2020-05-18 at 5.06.15 AM.png|width=1084,height=754!
>  
> [https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/Msck.java#L113]
> [https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreChecker.java#L234]
>  




[jira] [Updated] (HIVE-23468) LLAP: Optimise OrcEncodedDataReader to avoid FS init to NN

2020-05-17 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-23468:

Attachment: HIVE-23468.2.patch

> LLAP: Optimise OrcEncodedDataReader to avoid FS init to NN
> --
>
> Key: HIVE-23468
> URL: https://issues.apache.org/jira/browse/HIVE-23468
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-23468.1.patch, HIVE-23468.2.patch
>
>
> OrcEncodedDataReader materializes the supplier to check whether the file system
> is HDFS. This causes an unwanted call to the NN even when the cache is
> completely warmed up.
> [https://github.com/apache/hive/blob/master/llap-server/src/java/org/apache/hadoop/hive/llap/io/encoded/OrcEncodedDataReader.java#L540]
> [https://github.com/apache/hive/blob/9f40d7cc1d889aa3079f3f494cf810fabe326e44/ql/src/java/org/apache/hadoop/hive/ql/io/HdfsUtils.java#L107]
> The workaround is to set "hive.llap.io.use.fileid.path=false" to avoid this case.
> In a warmed-up scenario, the IO elevator could get a 100% cache hit from the
> FileSystem impl.




[jira] [Updated] (HIVE-23468) LLAP: Optimise OrcEncodedDataReader to avoid FS init to NN

2020-05-17 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-23468:

Status: Open  (was: Patch Available)

> LLAP: Optimise OrcEncodedDataReader to avoid FS init to NN
> --
>
> Key: HIVE-23468
> URL: https://issues.apache.org/jira/browse/HIVE-23468
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-23468.1.patch, HIVE-23468.2.patch
>
>
> OrcEncodedDataReader materializes the supplier to check whether the file system
> is HDFS. This causes an unwanted call to the NN even when the cache is
> completely warmed up.
> [https://github.com/apache/hive/blob/master/llap-server/src/java/org/apache/hadoop/hive/llap/io/encoded/OrcEncodedDataReader.java#L540]
> [https://github.com/apache/hive/blob/9f40d7cc1d889aa3079f3f494cf810fabe326e44/ql/src/java/org/apache/hadoop/hive/ql/io/HdfsUtils.java#L107]
> The workaround is to set "hive.llap.io.use.fileid.path=false" to avoid this case.
> In a warmed-up scenario, the IO elevator could get a 100% cache hit from the
> FileSystem impl.




[jira] [Updated] (HIVE-23468) LLAP: Optimise OrcEncodedDataReader to avoid FS init to NN

2020-05-17 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-23468:

Status: Patch Available  (was: Open)

> LLAP: Optimise OrcEncodedDataReader to avoid FS init to NN
> --
>
> Key: HIVE-23468
> URL: https://issues.apache.org/jira/browse/HIVE-23468
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-23468.1.patch, HIVE-23468.2.patch
>
>
> OrcEncodedDataReader materializes the supplier to check whether the file system
> is HDFS. This causes an unwanted call to the NN even when the cache is
> completely warmed up.
> [https://github.com/apache/hive/blob/master/llap-server/src/java/org/apache/hadoop/hive/llap/io/encoded/OrcEncodedDataReader.java#L540]
> [https://github.com/apache/hive/blob/9f40d7cc1d889aa3079f3f494cf810fabe326e44/ql/src/java/org/apache/hadoop/hive/ql/io/HdfsUtils.java#L107]
> The workaround is to set "hive.llap.io.use.fileid.path=false" to avoid this case.
> In a warmed-up scenario, the IO elevator could get a 100% cache hit from the
> FileSystem impl.




[jira] [Commented] (HIVE-23281) ObjectStore::convertToStorageDescriptor can be optimised to reduce calls to DB for ACID tables

2020-05-17 Thread Ashutosh Chauhan (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-23281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17109913#comment-17109913
 ] 

Ashutosh Chauhan commented on HIVE-23281:
-

This patch fixes the following issues in the original patch:
a) The compiler was always looking for skewed col info, even when it is not needed
and not returned from the metastore, which was causing an NPE.
b) Bucket cols are needed for ACID tables, since ACID tables can be bucketed.
Omitting them was causing failures.
c) We also need to implement the JDO path, since tests are run to make
sure results from the two paths are identical. Not having it implemented caused
test failures.

> ObjectStore::convertToStorageDescriptor can be optimised to reduce calls to 
> DB for ACID tables
> --
>
> Key: HIVE-23281
> URL: https://issues.apache.org/jira/browse/HIVE-23281
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Attachments: HIVE-23281.1.patch, HIVE-23281.2.patch, 
> image-2020-04-23-13-56-17-210.png
>
>
> [https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java#L1980]
>  
> [https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java#L1982]
>  
> SkewInfo, bucketCols, ordering, etc. are not needed for ACID tables. It may be
> good to check for transactional tables and get rid of these calls in table
> lookups.
>  
> This should help in reducing DB calls.
> !image-2020-04-23-13-56-17-210.png|width=669,height=485!
>  




[jira] [Updated] (HIVE-23281) ObjectStore::convertToStorageDescriptor can be optimised to reduce calls to DB for ACID tables

2020-05-17 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-23281:

Attachment: HIVE-23281.2.patch

> ObjectStore::convertToStorageDescriptor can be optimised to reduce calls to 
> DB for ACID tables
> --
>
> Key: HIVE-23281
> URL: https://issues.apache.org/jira/browse/HIVE-23281
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Attachments: HIVE-23281.1.patch, HIVE-23281.2.patch, 
> image-2020-04-23-13-56-17-210.png
>
>
> [https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java#L1980]
>  
> [https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java#L1982]
>  
> SkewInfo, bucketCols, ordering, etc. are not needed for ACID tables. It may be
> good to check for transactional tables and get rid of these calls in table
> lookups.
>  
> This should help in reducing DB calls.
> !image-2020-04-23-13-56-17-210.png|width=669,height=485!
>  




[jira] [Updated] (HIVE-23281) ObjectStore::convertToStorageDescriptor can be optimised to reduce calls to DB for ACID tables

2020-05-17 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-23281:

Status: Patch Available  (was: Open)

> ObjectStore::convertToStorageDescriptor can be optimised to reduce calls to 
> DB for ACID tables
> --
>
> Key: HIVE-23281
> URL: https://issues.apache.org/jira/browse/HIVE-23281
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Attachments: HIVE-23281.1.patch, HIVE-23281.2.patch, 
> image-2020-04-23-13-56-17-210.png
>
>
> [https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java#L1980]
>  
> [https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java#L1982]
>  
> SkewInfo, bucketCols, ordering, etc. are not needed for ACID tables. It may be
> good to check for transactional tables and get rid of these calls in table
> lookups.
>  
> This should help in reducing DB calls.
> !image-2020-04-23-13-56-17-210.png|width=669,height=485!
>  




[jira] [Updated] (HIVE-23281) ObjectStore::convertToStorageDescriptor can be optimised to reduce calls to DB for ACID tables

2020-05-17 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-23281:

Status: Open  (was: Patch Available)

> ObjectStore::convertToStorageDescriptor can be optimised to reduce calls to 
> DB for ACID tables
> --
>
> Key: HIVE-23281
> URL: https://issues.apache.org/jira/browse/HIVE-23281
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Attachments: HIVE-23281.1.patch, image-2020-04-23-13-56-17-210.png
>
>
> [https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java#L1980]
>  
> [https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java#L1982]
>  
> SkewInfo, bucketCols, ordering, etc. are not needed for ACID tables. It may be
> good to check for transactional tables and get rid of these calls in table
> lookups.
>  
> This should help in reducing DB calls.
> !image-2020-04-23-13-56-17-210.png|width=669,height=485!
>  




[jira] [Commented] (HIVE-23468) LLAP: Optimise OrcEncodedDataReader to avoid FS init to NN

2020-05-17 Thread Ashutosh Chauhan (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-23468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17109897#comment-17109897
 ] 

Ashutosh Chauhan commented on HIVE-23468:
-

+1

> LLAP: Optimise OrcEncodedDataReader to avoid FS init to NN
> --
>
> Key: HIVE-23468
> URL: https://issues.apache.org/jira/browse/HIVE-23468
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-23468.1.patch
>
>
> OrcEncodedDataReader materializes the supplier to check whether the file system
> is HDFS. This causes an unwanted call to the NN even when the cache is
> completely warmed up.
> [https://github.com/apache/hive/blob/master/llap-server/src/java/org/apache/hadoop/hive/llap/io/encoded/OrcEncodedDataReader.java#L540]
> [https://github.com/apache/hive/blob/9f40d7cc1d889aa3079f3f494cf810fabe326e44/ql/src/java/org/apache/hadoop/hive/ql/io/HdfsUtils.java#L107]
> The workaround is to set "hive.llap.io.use.fileid.path=false" to avoid this case.
> In a warmed-up scenario, the IO elevator could get a 100% cache hit from the
> FileSystem impl.




[jira] [Updated] (HIVE-23476) LLAP: Preallocate arenas for mmap case as well

2020-05-17 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-23476:

Fix Version/s: 4.0.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, Prasanth!

> LLAP: Preallocate arenas for mmap case as well
> --
>
> Key: HIVE-23476
> URL: https://issues.apache.org/jira/browse/HIVE-23476
> Project: Hive
>  Issue Type: Bug
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-23476.1.patch, HIVE-23476.2.patch
>
>
> BuddyAllocator pre-allocation of arenas does not happen for the mmap cache case.
> Since we are not filling up the mmap'ed buffers, the upfront allocations in
> the constructor are cheap. This can avoid lock-free allocation of arenas later in
> the code.




[jira] [Updated] (HIVE-23478) Fix flaky special_character_in_tabnames_quotes_1 test

2020-05-17 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-23478:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, John!

> Fix flaky special_character_in_tabnames_quotes_1 test
> -
>
> Key: HIVE-23478
> URL: https://issues.apache.org/jira/browse/HIVE-23478
> Project: Hive
>  Issue Type: Improvement
>  Components: Tests
>Affects Versions: 4.0.0
>Reporter: John Sherman
>Assignee: John Sherman
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-23478.1.patch, HIVE-23478.2.patch, 
> HIVE-23478.3.patch
>
>
> While testing https://issues.apache.org/jira/browse/HIVE-23354 
> special_character_in_tabnames_quotes_1 failed. Searching for the test, it 
> seems other patches have also had failures. I noticed that 
> special_character_in_tabnames_1 and special_character_in_tabnames_quotes_1 
> use the same database/table names. I suspect this is responsible for some of 
> the flakiness.




[jira] [Updated] (HIVE-23292) Reduce PartitionDesc payload in MapWork

2020-05-17 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-23292:

Fix Version/s: 4.0.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, Ramesh & Rajesh!

> Reduce PartitionDesc payload in MapWork
> ---
>
> Key: HIVE-23292
> URL: https://issues.apache.org/jira/browse/HIVE-23292
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-23292.1.patch, HIVE-23292.1.patch, 
> HIVE-23292.2.patch, HIVE-23292.3.patch, HIVE-23292.4.patch
>
>
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/plan/MapWork.java#L105




[jira] [Commented] (HIVE-23478) Fix flaky special_character_in_tabnames_quotes_1 test

2020-05-17 Thread Ashutosh Chauhan (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-23478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17109597#comment-17109597
 ] 

Ashutosh Chauhan commented on HIVE-23478:
-

+1

> Fix flaky special_character_in_tabnames_quotes_1 test
> -
>
> Key: HIVE-23478
> URL: https://issues.apache.org/jira/browse/HIVE-23478
> Project: Hive
>  Issue Type: Improvement
>  Components: Tests
>Affects Versions: 4.0.0
>Reporter: John Sherman
>Assignee: John Sherman
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-23478.1.patch, HIVE-23478.2.patch, 
> HIVE-23478.3.patch
>
>
> While testing https://issues.apache.org/jira/browse/HIVE-23354 
> special_character_in_tabnames_quotes_1 failed. Searching for the test, it 
> seems other patches have also had failures. I noticed that 
> special_character_in_tabnames_1 and special_character_in_tabnames_quotes_1 
> use the same database/table names. I suspect this is responsible for some of 
> the flakiness.




[jira] [Commented] (HIVE-23478) Fix flaky special_character_in_tabnames_quotes_1 test

2020-05-17 Thread Ashutosh Chauhan (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-23478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17109577#comment-17109577
 ] 

Ashutosh Chauhan commented on HIVE-23478:
-

Doing both won't hurt and makes the test more resilient. So we can add a 
unique name as well as an ORDER BY.

> Fix flaky special_character_in_tabnames_quotes_1 test
> -
>
> Key: HIVE-23478
> URL: https://issues.apache.org/jira/browse/HIVE-23478
> Project: Hive
>  Issue Type: Improvement
>  Components: Tests
>Affects Versions: 4.0.0
>Reporter: John Sherman
>Assignee: John Sherman
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-23478.1.patch, HIVE-23478.2.patch
>
>
> While testing https://issues.apache.org/jira/browse/HIVE-23354 
> special_character_in_tabnames_quotes_1 failed. Searching for the test, it 
> seems other patches have also had failures. I noticed that 
> special_character_in_tabnames_1 and special_character_in_tabnames_quotes_1 
> use the same database/table names. I suspect this is responsible for some of 
> the flakiness.




[jira] [Commented] (HIVE-23478) Fix flaky special_character_in_tabnames_quotes_1 test

2020-05-17 Thread Ashutosh Chauhan (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-23478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17109544#comment-17109544
 ] 

Ashutosh Chauhan commented on HIVE-23478:
-

I also hit this flaky test in one of my patches. The failure I saw was the following:
{code}
java.lang.AssertionError: 
Client Execution succeeded but contained differences (error code = 1) after 
executing special_character_in_tabnames_quotes_1.q 
15534c15534
<  14   2
---
>  14   2
{code}

I don't think using a unique name will suffice. The issue seems to be the order 
of rows in the test output. It probably needs a fix similar to HIVE-23484.
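
To illustrate why row order matters here, a small sketch, assuming the test harness compares actual output with a golden file line by line (the class and method names are hypothetical, and the sample rows are made up). Two runs that return the same rows in a different order fail a positional comparison but match once both sides are sorted, which is what an ORDER BY in the query guarantees at the source.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper contrasting positional and order-insensitive comparison. */
final class OutputCompare {
  /** Positional comparison: fails if the same rows arrive in another order. */
  static boolean sameLines(List<String> expected, List<String> actual) {
    return expected.equals(actual);
  }

  /** Sorts copies of both sides first, so only the set of rows matters. */
  static boolean sameRowsAnyOrder(List<String> expected, List<String> actual) {
    List<String> e = new ArrayList<>(expected);
    List<String> a = new ArrayList<>(actual);
    Collections.sort(e);
    Collections.sort(a);
    return e.equals(a);
  }

  public static void main(String[] args) {
    List<String> golden = Arrays.asList("13\t2", "14\t2");
    List<String> run = Arrays.asList("14\t2", "13\t2");
    System.out.println("positional match: " + sameLines(golden, run));
    System.out.println("order-insensitive match: " + sameRowsAnyOrder(golden, run));
  }
}
```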

> Fix flaky special_character_in_tabnames_quotes_1 test
> -
>
> Key: HIVE-23478
> URL: https://issues.apache.org/jira/browse/HIVE-23478
> Project: Hive
>  Issue Type: Improvement
>  Components: Tests
>Affects Versions: 4.0.0
>Reporter: John Sherman
>Assignee: John Sherman
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-23478.1.patch, HIVE-23478.2.patch
>
>
> While testing https://issues.apache.org/jira/browse/HIVE-23354 
> special_character_in_tabnames_quotes_1 failed. Searching for the test, it 
> seems other patches have also had failures. I noticed that 
> special_character_in_tabnames_1 and special_character_in_tabnames_quotes_1 
> use the same database/table names. I suspect this is responsible for some of 
> the flakiness.




[jira] [Updated] (HIVE-23443) LLAP speculative task pre-emption seems to be not working

2020-05-17 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-23443:

Fix Version/s: 4.0.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, Prasanth!

> LLAP speculative task pre-emption seems to be not working
> -
>
> Key: HIVE-23443
> URL: https://issues.apache.org/jira/browse/HIVE-23443
> Project: Hive
>  Issue Type: Bug
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-23443.1.patch, HIVE-23443.2.patch, 
> HIVE-23443.3.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> I think after HIVE-23210 we are getting a stable sort order and it is causing 
> pre-emption to not work in certain cases.
> {code:java}
> "attempt_1589167813851__119_01_08_0 
> (hive_20200511055921_89598f09-19f1-4969-ab7a-82e2dd796273-119/Map 1, started 
> at 2020-05-11 05:59:22, in preemption queue, can finish)", 
> "attempt_1589167813851_0008_84_01_08_1 
> (hive_20200511055928_7ae29ca3-e67d-4d1f-b193-05651023b503-84/Map 1, started 
> at 2020-05-11 06:00:23, in preemption queue, can finish)" {code}
> The scheduler only peeks at the pre-emption queue and checks whether the 
> head task is non-finishable. 
> [https://github.com/apache/hive/blob/master/llap-server/src/java/org/apache/hadoop/hive/llap/daemon/impl/TaskExecutorService.java#L420]
> In the above case, all tasks are speculative, but a state change does not 
> trigger re-ordering of the pre-emption queue, so peek() always returns a 
> canFinish task even though non-finishable tasks are in the queue. 
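
The behaviour described above can be reproduced with a plain `java.util.PriorityQueue`, which orders elements only when they are inserted or removed. The sketch below is an illustration under assumptions, not the TaskExecutorService code: `Task`, `VICTIM_ORDER` and `reorder` are hypothetical names. Changing `canFinish` on a queued task leaves the head unchanged until the task is removed and added again.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

final class PreemptionQueueDemo {
  /** Hypothetical stand-in for a queued task. */
  static final class Task {
    final String name;
    boolean canFinish;

    Task(String name, boolean canFinish) {
      this.name = name;
      this.canFinish = canFinish;
    }
  }

  // Non-finishable tasks (canFinish == false) sort first, so they are the
  // first preemption candidates.
  static final Comparator<Task> VICTIM_ORDER =
      Comparator.comparing((Task t) -> t.canFinish);

  /** Re-inserts a task so the heap takes its new state into account. */
  static void reorder(PriorityQueue<Task> queue, Task task) {
    if (queue.remove(task)) {
      queue.add(task);
    }
  }

  public static void main(String[] args) {
    PriorityQueue<Task> queue = new PriorityQueue<>(VICTIM_ORDER);
    Task a = new Task("a", true);
    Task b = new Task("b", true);
    queue.add(a);
    queue.add(b);

    b.canFinish = false; // state change; the heap is not re-ordered
    System.out.println("head before reorder: " + queue.peek().name);

    reorder(queue, b);
    System.out.println("head after reorder: " + queue.peek().name);
  }
}
```

Removing and re-adding costs O(n) for the removal; whether that is acceptable depends on queue size, which this sketch does not address.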




[jira] [Updated] (HIVE-23476) LLAP: Preallocate arenas for mmap case as well

2020-05-17 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-23476:

Status: Patch Available  (was: Open)

> LLAP: Preallocate arenas for mmap case as well
> --
>
> Key: HIVE-23476
> URL: https://issues.apache.org/jira/browse/HIVE-23476
> Project: Hive
>  Issue Type: Bug
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-23476.1.patch, HIVE-23476.2.patch
>
>
> BuddyAllocator does not pre-allocate arenas in the mmap cache case. Since 
> the mmap'ed buffers are not filled up front, allocating them in the 
> constructor is cheap. This can avoid the lock-free allocation of arenas 
> later in the code. 




[jira] [Updated] (HIVE-23476) LLAP: Preallocate arenas for mmap case as well

2020-05-17 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-23476:

Attachment: HIVE-23476.2.patch

> LLAP: Preallocate arenas for mmap case as well
> --
>
> Key: HIVE-23476
> URL: https://issues.apache.org/jira/browse/HIVE-23476
> Project: Hive
>  Issue Type: Bug
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-23476.1.patch, HIVE-23476.2.patch
>
>
> BuddyAllocator does not pre-allocate arenas in the mmap cache case. Since 
> the mmap'ed buffers are not filled up front, allocating them in the 
> constructor is cheap. This can avoid the lock-free allocation of arenas 
> later in the code. 




[jira] [Updated] (HIVE-23476) LLAP: Preallocate arenas for mmap case as well

2020-05-17 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-23476:

Status: Open  (was: Patch Available)

> LLAP: Preallocate arenas for mmap case as well
> --
>
> Key: HIVE-23476
> URL: https://issues.apache.org/jira/browse/HIVE-23476
> Project: Hive
>  Issue Type: Bug
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-23476.1.patch, HIVE-23476.2.patch
>
>
> BuddyAllocator does not pre-allocate arenas in the mmap cache case. Since 
> the mmap'ed buffers are not filled up front, allocating them in the 
> constructor is cheap. This can avoid the lock-free allocation of arenas 
> later in the code. 




[jira] [Updated] (HIVE-23477) LLAP : mmap allocation interruptions fails to notify other threads

2020-05-17 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-23477:

Status: Patch Available  (was: Open)

> LLAP : mmap allocation interruptions fails to notify other threads
> --
>
> Key: HIVE-23477
> URL: https://issues.apache.org/jira/browse/HIVE-23477
> Project: Hive
>  Issue Type: Bug
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23477.1.patch, HIVE-23477.2.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> BuddyAllocator always uses lazy allocation if mmap is enabled. If a query 
> fragment is interrupted at the time of arena allocation, 
> ClosedByInterruptException is thrown. This exception artificially triggers 
> an allocator OutOfMemoryError and fails to notify other threads waiting to 
> allocate arenas. 
> {code:java}
> 2020-05-15 00:03:23.254  WARN [TezTR-128417_1_3_1_1_0] LlapIoImpl: Failed 
> trying to allocate memory mapped arena
> java.nio.channels.ClosedByInterruptException
> at 
> java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
> at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:970)
> at 
> org.apache.hadoop.hive.llap.cache.BuddyAllocator.preallocateArenaBuffer(BuddyAllocator.java:867)
> at 
> org.apache.hadoop.hive.llap.cache.BuddyAllocator.access$1100(BuddyAllocator.java:69)
> at 
> org.apache.hadoop.hive.llap.cache.BuddyAllocator$Arena.init(BuddyAllocator.java:900)
> at 
> org.apache.hadoop.hive.llap.cache.BuddyAllocator$Arena.allocateWithExpand(BuddyAllocator.java:1458)
> at 
> org.apache.hadoop.hive.llap.cache.BuddyAllocator$Arena.access$800(BuddyAllocator.java:884)
> at 
> org.apache.hadoop.hive.llap.cache.BuddyAllocator.allocateWithExpand(BuddyAllocator.java:740)
> at 
> org.apache.hadoop.hive.llap.cache.BuddyAllocator.allocateMultiple(BuddyAllocator.java:330)
> at 
> org.apache.hadoop.hive.llap.io.metadata.MetadataCache.wrapBbForFile(MetadataCache.java:257)
> at 
> org.apache.hadoop.hive.llap.io.metadata.MetadataCache.putFileMetadata(MetadataCache.java:216)
> at 
> org.apache.hadoop.hive.llap.io.metadata.MetadataCache.putFileMetadata(MetadataCache.java:49)
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.readSplitFooter(VectorizedParquetRecordReader.java:343)
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.initialize(VectorizedParquetRecordReader.java:238)
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.<init>(VectorizedParquetRecordReader.java:160)
> at 
> org.apache.hadoop.hive.ql.io.parquet.VectorizedParquetInputFormat.getRecordReader(VectorizedParquetInputFormat.java:50)
> at 
> org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat.getRecordReader(MapredParquetInputFormat.java:87)
> at 
> org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:427)
> at 
> org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.initNextRecordReader(TezGroupedSplitsInputFormat.java:203)
> at 
> org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.<init>(TezGroupedSplitsInputFormat.java:145)
> at 
> org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat.getRecordReader(TezGroupedSplitsInputFormat.java:111)
> at 
> org.apache.tez.mapreduce.lib.MRReaderMapred.setupOldRecordReader(MRReaderMapred.java:156)
> at 
> org.apache.tez.mapreduce.lib.MRReaderMapred.setSplit(MRReaderMapred.java:82)
> at 
> org.apache.tez.mapreduce.input.MRInput.initFromEventInternal(MRInput.java:703)
> at 
> org.apache.tez.mapreduce.input.MRInput.initFromEvent(MRInput.java:662)
> at 
> org.apache.tez.mapreduce.input.MRInputLegacy.checkAndAwaitRecordReaderInitialization(MRInputLegacy.java:150)
> at 
> org.apache.tez.mapreduce.input.MRInputLegacy.init(MRInputLegacy.java:114)
> at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getMRInput(MapRecordProcessor.java:532)
> at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:178)
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:266)
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
> at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
> at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(T
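
The failure mode in the description (an exception during arena allocation leaving other threads waiting) is commonly avoided by signalling waiters in a `finally` block. The sketch below shows that general pattern only; it is not the HIVE-23477 patch, and `ArenaSlot`, `allocate` and `mapArena` are hypothetical names.

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical single-arena slot that several threads may try to allocate. */
final class ArenaSlot {
  private final ReentrantLock lock = new ReentrantLock();
  private final Condition done = lock.newCondition();
  private boolean allocating;
  private boolean ready;

  /** Runs mapArena at most once at a time; returns true once the arena is ready. */
  boolean allocate(Runnable mapArena) {
    lock.lock();
    try {
      while (allocating) {
        done.awaitUninterruptibly(); // wait for the thread currently allocating
      }
      if (ready) {
        return true;
      }
      allocating = true;
    } finally {
      lock.unlock();
    }

    boolean ok = false;
    try {
      mapArena.run(); // may throw, e.g. if the thread is interrupted mid-map
      ok = true;
      return true;
    } finally {
      lock.lock();
      try {
        allocating = false;
        ready = ok;
        done.signalAll(); // wake waiters on success and on failure
      } finally {
        lock.unlock();
      }
    }
  }

  public static void main(String[] args) {
    ArenaSlot slot = new ArenaSlot();
    try {
      slot.allocate(() -> {
        throw new IllegalStateException("simulated interruption during map");
      });
    } catch (IllegalStateException expected) {
      System.out.println("first attempt failed: " + expected.getMessage());
    }
    System.out.println("retry succeeded: " + slot.allocate(() -> { }));
  }
}
```

Because the flag is reset and waiters are signalled even when `mapArena` throws, a later caller can retry instead of blocking forever.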

[jira] [Updated] (HIVE-23477) LLAP : mmap allocation interruptions fails to notify other threads

2020-05-17 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-23477:

Status: Open  (was: Patch Available)

> LLAP : mmap allocation interruptions fails to notify other threads
> --
>
> Key: HIVE-23477
> URL: https://issues.apache.org/jira/browse/HIVE-23477
> Project: Hive
>  Issue Type: Bug
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23477.1.patch, HIVE-23477.2.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> BuddyAllocator always uses lazy allocation if mmap is enabled. If a query 
> fragment is interrupted at the time of arena allocation, 
> ClosedByInterruptException is thrown. This exception artificially triggers 
> an allocator OutOfMemoryError and fails to notify other threads waiting to 
> allocate arenas. 
> {code:java}
> 2020-05-15 00:03:23.254  WARN [TezTR-128417_1_3_1_1_0] LlapIoImpl: Failed 
> trying to allocate memory mapped arena
> java.nio.channels.ClosedByInterruptException
> at 
> java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
> at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:970)
> at 
> org.apache.hadoop.hive.llap.cache.BuddyAllocator.preallocateArenaBuffer(BuddyAllocator.java:867)
> at 
> org.apache.hadoop.hive.llap.cache.BuddyAllocator.access$1100(BuddyAllocator.java:69)
> at 
> org.apache.hadoop.hive.llap.cache.BuddyAllocator$Arena.init(BuddyAllocator.java:900)
> at 
> org.apache.hadoop.hive.llap.cache.BuddyAllocator$Arena.allocateWithExpand(BuddyAllocator.java:1458)
> at 
> org.apache.hadoop.hive.llap.cache.BuddyAllocator$Arena.access$800(BuddyAllocator.java:884)
> at 
> org.apache.hadoop.hive.llap.cache.BuddyAllocator.allocateWithExpand(BuddyAllocator.java:740)
> at 
> org.apache.hadoop.hive.llap.cache.BuddyAllocator.allocateMultiple(BuddyAllocator.java:330)
> at 
> org.apache.hadoop.hive.llap.io.metadata.MetadataCache.wrapBbForFile(MetadataCache.java:257)
> at 
> org.apache.hadoop.hive.llap.io.metadata.MetadataCache.putFileMetadata(MetadataCache.java:216)
> at 
> org.apache.hadoop.hive.llap.io.metadata.MetadataCache.putFileMetadata(MetadataCache.java:49)
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.readSplitFooter(VectorizedParquetRecordReader.java:343)
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.initialize(VectorizedParquetRecordReader.java:238)
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.<init>(VectorizedParquetRecordReader.java:160)
> at 
> org.apache.hadoop.hive.ql.io.parquet.VectorizedParquetInputFormat.getRecordReader(VectorizedParquetInputFormat.java:50)
> at 
> org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat.getRecordReader(MapredParquetInputFormat.java:87)
> at 
> org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:427)
> at 
> org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.initNextRecordReader(TezGroupedSplitsInputFormat.java:203)
> at 
> org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.<init>(TezGroupedSplitsInputFormat.java:145)
> at 
> org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat.getRecordReader(TezGroupedSplitsInputFormat.java:111)
> at 
> org.apache.tez.mapreduce.lib.MRReaderMapred.setupOldRecordReader(MRReaderMapred.java:156)
> at 
> org.apache.tez.mapreduce.lib.MRReaderMapred.setSplit(MRReaderMapred.java:82)
> at 
> org.apache.tez.mapreduce.input.MRInput.initFromEventInternal(MRInput.java:703)
> at 
> org.apache.tez.mapreduce.input.MRInput.initFromEvent(MRInput.java:662)
> at 
> org.apache.tez.mapreduce.input.MRInputLegacy.checkAndAwaitRecordReaderInitialization(MRInputLegacy.java:150)
> at 
> org.apache.tez.mapreduce.input.MRInputLegacy.init(MRInputLegacy.java:114)
> at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getMRInput(MapRecordProcessor.java:532)
> at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:178)
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:266)
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
> at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
> at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(T

[jira] [Updated] (HIVE-23477) LLAP : mmap allocation interruptions fails to notify other threads

2020-05-17 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-23477:

Attachment: HIVE-23477.2.patch

> LLAP : mmap allocation interruptions fails to notify other threads
> --
>
> Key: HIVE-23477
> URL: https://issues.apache.org/jira/browse/HIVE-23477
> Project: Hive
>  Issue Type: Bug
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23477.1.patch, HIVE-23477.2.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> BuddyAllocator always uses lazy allocation if mmap is enabled. If a query 
> fragment is interrupted at the time of arena allocation, 
> ClosedByInterruptException is thrown. This exception artificially triggers 
> an allocator OutOfMemoryError and fails to notify other threads waiting to 
> allocate arenas. 
> {code:java}
> 2020-05-15 00:03:23.254  WARN [TezTR-128417_1_3_1_1_0] LlapIoImpl: Failed 
> trying to allocate memory mapped arena
> java.nio.channels.ClosedByInterruptException
> at 
> java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
> at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:970)
> at 
> org.apache.hadoop.hive.llap.cache.BuddyAllocator.preallocateArenaBuffer(BuddyAllocator.java:867)
> at 
> org.apache.hadoop.hive.llap.cache.BuddyAllocator.access$1100(BuddyAllocator.java:69)
> at 
> org.apache.hadoop.hive.llap.cache.BuddyAllocator$Arena.init(BuddyAllocator.java:900)
> at 
> org.apache.hadoop.hive.llap.cache.BuddyAllocator$Arena.allocateWithExpand(BuddyAllocator.java:1458)
> at 
> org.apache.hadoop.hive.llap.cache.BuddyAllocator$Arena.access$800(BuddyAllocator.java:884)
> at 
> org.apache.hadoop.hive.llap.cache.BuddyAllocator.allocateWithExpand(BuddyAllocator.java:740)
> at 
> org.apache.hadoop.hive.llap.cache.BuddyAllocator.allocateMultiple(BuddyAllocator.java:330)
> at 
> org.apache.hadoop.hive.llap.io.metadata.MetadataCache.wrapBbForFile(MetadataCache.java:257)
> at 
> org.apache.hadoop.hive.llap.io.metadata.MetadataCache.putFileMetadata(MetadataCache.java:216)
> at 
> org.apache.hadoop.hive.llap.io.metadata.MetadataCache.putFileMetadata(MetadataCache.java:49)
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.readSplitFooter(VectorizedParquetRecordReader.java:343)
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.initialize(VectorizedParquetRecordReader.java:238)
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.<init>(VectorizedParquetRecordReader.java:160)
> at 
> org.apache.hadoop.hive.ql.io.parquet.VectorizedParquetInputFormat.getRecordReader(VectorizedParquetInputFormat.java:50)
> at 
> org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat.getRecordReader(MapredParquetInputFormat.java:87)
> at 
> org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:427)
> at 
> org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.initNextRecordReader(TezGroupedSplitsInputFormat.java:203)
> at 
> org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.<init>(TezGroupedSplitsInputFormat.java:145)
> at 
> org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat.getRecordReader(TezGroupedSplitsInputFormat.java:111)
> at 
> org.apache.tez.mapreduce.lib.MRReaderMapred.setupOldRecordReader(MRReaderMapred.java:156)
> at 
> org.apache.tez.mapreduce.lib.MRReaderMapred.setSplit(MRReaderMapred.java:82)
> at 
> org.apache.tez.mapreduce.input.MRInput.initFromEventInternal(MRInput.java:703)
> at 
> org.apache.tez.mapreduce.input.MRInput.initFromEvent(MRInput.java:662)
> at 
> org.apache.tez.mapreduce.input.MRInputLegacy.checkAndAwaitRecordReaderInitialization(MRInputLegacy.java:150)
> at 
> org.apache.tez.mapreduce.input.MRInputLegacy.init(MRInputLegacy.java:114)
> at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getMRInput(MapRecordProcessor.java:532)
> at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:178)
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:266)
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
> at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
> at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRun

[jira] [Updated] (HIVE-23292) Reduce PartitionDesc payload in MapWork

2020-05-17 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-23292:

Status: Patch Available  (was: Open)

> Reduce PartitionDesc payload in MapWork
> ---
>
> Key: HIVE-23292
> URL: https://issues.apache.org/jira/browse/HIVE-23292
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Attachments: HIVE-23292.1.patch, HIVE-23292.1.patch, 
> HIVE-23292.2.patch, HIVE-23292.3.patch, HIVE-23292.4.patch
>
>
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/plan/MapWork.java#L105




[jira] [Updated] (HIVE-23292) Reduce PartitionDesc payload in MapWork

2020-05-17 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-23292:

Status: Open  (was: Patch Available)

> Reduce PartitionDesc payload in MapWork
> ---
>
> Key: HIVE-23292
> URL: https://issues.apache.org/jira/browse/HIVE-23292
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Attachments: HIVE-23292.1.patch, HIVE-23292.1.patch, 
> HIVE-23292.2.patch, HIVE-23292.3.patch, HIVE-23292.4.patch
>
>
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/plan/MapWork.java#L105




[jira] [Updated] (HIVE-23292) Reduce PartitionDesc payload in MapWork

2020-05-17 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-23292:

Attachment: HIVE-23292.4.patch

> Reduce PartitionDesc payload in MapWork
> ---
>
> Key: HIVE-23292
> URL: https://issues.apache.org/jira/browse/HIVE-23292
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Attachments: HIVE-23292.1.patch, HIVE-23292.1.patch, 
> HIVE-23292.2.patch, HIVE-23292.3.patch, HIVE-23292.4.patch
>
>
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/plan/MapWork.java#L105




[jira] [Updated] (HIVE-23292) Reduce PartitionDesc payload in MapWork

2020-05-16 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-23292:

Status: Patch Available  (was: Open)

> Reduce PartitionDesc payload in MapWork
> ---
>
> Key: HIVE-23292
> URL: https://issues.apache.org/jira/browse/HIVE-23292
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Attachments: HIVE-23292.1.patch, HIVE-23292.1.patch, 
> HIVE-23292.2.patch, HIVE-23292.3.patch
>
>
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/plan/MapWork.java#L105




[jira] [Updated] (HIVE-23292) Reduce PartitionDesc payload in MapWork

2020-05-16 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-23292:

Attachment: HIVE-23292.3.patch

> Reduce PartitionDesc payload in MapWork
> ---
>
> Key: HIVE-23292
> URL: https://issues.apache.org/jira/browse/HIVE-23292
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Attachments: HIVE-23292.1.patch, HIVE-23292.1.patch, 
> HIVE-23292.2.patch, HIVE-23292.3.patch
>
>
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/plan/MapWork.java#L105




[jira] [Updated] (HIVE-23292) Reduce PartitionDesc payload in MapWork

2020-05-16 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-23292:

Status: Open  (was: Patch Available)

> Reduce PartitionDesc payload in MapWork
> ---
>
> Key: HIVE-23292
> URL: https://issues.apache.org/jira/browse/HIVE-23292
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Attachments: HIVE-23292.1.patch, HIVE-23292.1.patch, 
> HIVE-23292.2.patch
>
>
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/plan/MapWork.java#L105




[jira] [Updated] (HIVE-23376) Avoid repeated SHA computation in GenericUDTFGetSplits for hive-exec jar

2020-05-16 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-23376:

Fix Version/s: 4.0.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, Ramesh!

> Avoid repeated SHA computation in GenericUDTFGetSplits for hive-exec jar
> 
>
> Key: HIVE-23376
> URL: https://issues.apache.org/jira/browse/HIVE-23376
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-23376.2.patch, HIVE-23376.3.patch, 
> HIVE-23376.4.patch, HIVE-23376.5.patch, image-2020-05-06-16-37-48-615.png
>
>
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDTFGetSplits.java#L706]
>  
>  
> !image-2020-05-06-16-37-48-615.png|width=946,height=639!




[jira] [Updated] (HIVE-23446) LLAP: Reduce IPC connection misses to AM for short queries

2020-05-16 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-23446:

Fix Version/s: 4.0.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, Rajesh!

> LLAP: Reduce IPC connection misses to AM for short queries
> --
>
> Key: HIVE-23446
> URL: https://issues.apache.org/jira/browse/HIVE-23446
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-23446.1.patch, HIVE-23446.2.patch, 
> HIVE-23446.3.patch, HIVE-23446.4.patch, HIVE-23446.5.patch
>
>
> [https://github.com/apache/hive/blob/master/llap-server/src/java/org/apache/hadoop/hive/llap/daemon/impl/QueryInfo.java#L343]
>  
> The umbilical UGI pool is maintained at the QueryInfo level. When there are 
> lots of short queries, this ends up missing the IPC cache and recreating 
> threads/connections to the same AM.
> It would be good to maintain this pool in {{ContainerRunnerImpl}} instead and 
> recycle as needed.
>  
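
A rough sketch of the proposed direction, under stated assumptions: it replaces the UGI pool with a generic cache keyed by AM address, and `SharedConnectionCache`, `get` and the address string are hypothetical. A cache that outlives individual queries lets many short queries against the same AM reuse one entry instead of creating one each; recycling and eviction are not shown.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical cache shared across queries, keyed by AM address. */
final class SharedConnectionCache<C> {
  private final ConcurrentMap<String, C> byAmAddress = new ConcurrentHashMap<>();
  private final Function<String, C> factory;

  SharedConnectionCache(Function<String, C> factory) {
    this.factory = factory;
  }

  /** Returns the cached entry for this AM, creating it on first use only. */
  C get(String amAddress) {
    return byAmAddress.computeIfAbsent(amAddress, factory);
  }

  public static void main(String[] args) {
    AtomicInteger created = new AtomicInteger();
    SharedConnectionCache<String> cache = new SharedConnectionCache<>(
        address -> "connection-" + created.incrementAndGet() + "@" + address);

    // Two short queries talking to the same AM share one entry.
    String first = cache.get("am-host:1234");
    String second = cache.get("am-host:1234");
    System.out.println("same entry reused: " + (first == second));
    System.out.println("entries created: " + created.get());
  }
}
```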




[jira] [Updated] (HIVE-23292) Reduce PartitionDesc payload in MapWork

2020-05-16 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-23292:

Status: Open  (was: Patch Available)

> Reduce PartitionDesc payload in MapWork
> ---
>
> Key: HIVE-23292
> URL: https://issues.apache.org/jira/browse/HIVE-23292
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Attachments: HIVE-23292.1.patch, HIVE-23292.1.patch, 
> HIVE-23292.2.patch
>
>
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/plan/MapWork.java#L105




[jira] [Updated] (HIVE-23292) Reduce PartitionDesc payload in MapWork

2020-05-16 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-23292:

Status: Patch Available  (was: Open)

> Reduce PartitionDesc payload in MapWork
> ---
>
> Key: HIVE-23292
> URL: https://issues.apache.org/jira/browse/HIVE-23292
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Attachments: HIVE-23292.1.patch, HIVE-23292.1.patch, 
> HIVE-23292.2.patch
>
>
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/plan/MapWork.java#L105




[jira] [Updated] (HIVE-23292) Reduce PartitionDesc payload in MapWork

2020-05-16 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-23292:

Attachment: HIVE-23292.2.patch

> Reduce PartitionDesc payload in MapWork
> ---
>
> Key: HIVE-23292
> URL: https://issues.apache.org/jira/browse/HIVE-23292
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Attachments: HIVE-23292.1.patch, HIVE-23292.1.patch, 
> HIVE-23292.2.patch
>
>
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/plan/MapWork.java#L105




[jira] [Updated] (HIVE-23376) Avoid repeated SHA computation in GenericUDTFGetSplits for hive-exec jar

2020-05-16 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-23376:

Status: Open  (was: Patch Available)

> Avoid repeated SHA computation in GenericUDTFGetSplits for hive-exec jar
> 
>
> Key: HIVE-23376
> URL: https://issues.apache.org/jira/browse/HIVE-23376
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Attachments: HIVE-23376.2.patch, HIVE-23376.3.patch, 
> HIVE-23376.4.patch, HIVE-23376.5.patch, image-2020-05-06-16-37-48-615.png
>
>
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDTFGetSplits.java#L706]
>  
>  
> !image-2020-05-06-16-37-48-615.png|width=946,height=639!




[jira] [Updated] (HIVE-23376) Avoid repeated SHA computation in GenericUDTFGetSplits for hive-exec jar

2020-05-16 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-23376:

Attachment: HIVE-23376.5.patch

> Avoid repeated SHA computation in GenericUDTFGetSplits for hive-exec jar
> 
>
> Key: HIVE-23376
> URL: https://issues.apache.org/jira/browse/HIVE-23376
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Attachments: HIVE-23376.2.patch, HIVE-23376.3.patch, 
> HIVE-23376.4.patch, HIVE-23376.5.patch, image-2020-05-06-16-37-48-615.png
>
>
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDTFGetSplits.java#L706]
>  
>  
> !image-2020-05-06-16-37-48-615.png|width=946,height=639!




[jira] [Updated] (HIVE-23376) Avoid repeated SHA computation in GenericUDTFGetSplits for hive-exec jar

2020-05-16 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-23376:

Status: Patch Available  (was: Open)

> Avoid repeated SHA computation in GenericUDTFGetSplits for hive-exec jar
> 
>
> Key: HIVE-23376
> URL: https://issues.apache.org/jira/browse/HIVE-23376
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Attachments: HIVE-23376.2.patch, HIVE-23376.3.patch, 
> HIVE-23376.4.patch, HIVE-23376.5.patch, image-2020-05-06-16-37-48-615.png
>
>
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDTFGetSplits.java#L706]
>  
>  
> !image-2020-05-06-16-37-48-615.png|width=946,height=639!




[jira] [Updated] (HIVE-23449) LLAP: Reduce mkdir and config creations in submitWork hotpath

2020-05-16 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-23449:

Fix Version/s: 4.0.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, Rajesh!

> LLAP: Reduce mkdir and config creations in submitWork hotpath
> -
>
> Key: HIVE-23449
> URL: https://issues.apache.org/jira/browse/HIVE-23449
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-23449.1.patch, HIVE-23449.2.patch, 
> HIVE-23449.3.patch, HIVE-23449.4.patch, HIVE-23449.5.patch, Screenshot 
> 2020-05-12 at 1.09.35 PM.png
>
>
> !Screenshot 2020-05-12 at 1.09.35 PM.png|width=885,height=558!
>  
> For short jobs, submitWork is on the hot path. It can lazy-load the conf and 
> skip dir creations (which need to happen only when DirWatcher is 
> enabled).
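A minimal sketch of the lazy-loading idea in the description above, assuming nothing about how the actual patch does it: wrap the expensive construction in a memoizing holder so that it runs only when first needed, and at most once. Lazy is a hypothetical helper name, not a Hive class.

```java
import java.util.function.Supplier;

// Hypothetical memoizing holder: runs the loader on first get() and caches the result.
final class Lazy<T> implements Supplier<T> {
  private final Supplier<T> loader;
  private volatile T value;

  Lazy(Supplier<T> loader) {
    this.loader = loader;
  }

  @Override
  public T get() {
    T v = value;
    if (v == null) {
      synchronized (this) {
        v = value;
        if (v == null) {
          // The loader must not return null, or it would be called again.
          v = loader.get();
          value = v;
        }
      }
    }
    return v;
  }
}
```

A request path that never touches the conf then never pays for building it. The same guard style (check a flag, then act) would apply to skipping directory creation when the watcher is disabled.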




[jira] [Updated] (HIVE-23446) LLAP: Reduce IPC connection misses to AM for short queries

2020-05-16 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-23446:

Status: Patch Available  (was: Open)

> LLAP: Reduce IPC connection misses to AM for short queries
> --
>
> Key: HIVE-23446
> URL: https://issues.apache.org/jira/browse/HIVE-23446
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Major
> Attachments: HIVE-23446.1.patch, HIVE-23446.2.patch, 
> HIVE-23446.3.patch, HIVE-23446.4.patch, HIVE-23446.5.patch
>
>
> [https://github.com/apache/hive/blob/master/llap-server/src/java/org/apache/hadoop/hive/llap/daemon/impl/QueryInfo.java#L343]
>  
> Umbilical UGI pool is maintained at QueryInfo level. When there are lots 
> of short queries, this ends up missing IPC cache and ends up recreating 
> threads/connections to the same AM.
> It would be good to maintain this pool in {{ContainerRunnerImpl}} instead and 
> recycle as needed.
>  




[jira] [Updated] (HIVE-23446) LLAP: Reduce IPC connection misses to AM for short queries

2020-05-16 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-23446:

Attachment: HIVE-23446.5.patch

> LLAP: Reduce IPC connection misses to AM for short queries
> --
>
> Key: HIVE-23446
> URL: https://issues.apache.org/jira/browse/HIVE-23446
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Major
> Attachments: HIVE-23446.1.patch, HIVE-23446.2.patch, 
> HIVE-23446.3.patch, HIVE-23446.4.patch, HIVE-23446.5.patch
>
>
> [https://github.com/apache/hive/blob/master/llap-server/src/java/org/apache/hadoop/hive/llap/daemon/impl/QueryInfo.java#L343]
>  
> Umbilical UGI pool is maintained at QueryInfo level. When there are lots 
> of short queries, this ends up missing IPC cache and ends up recreating 
> threads/connections to the same AM.
> It would be good to maintain this pool in {{ContainerRunnerImpl}} instead and 
> recycle as needed.
>  




[jira] [Updated] (HIVE-23446) LLAP: Reduce IPC connection misses to AM for short queries

2020-05-16 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-23446:

Status: Open  (was: Patch Available)

> LLAP: Reduce IPC connection misses to AM for short queries
> --
>
> Key: HIVE-23446
> URL: https://issues.apache.org/jira/browse/HIVE-23446
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Major
> Attachments: HIVE-23446.1.patch, HIVE-23446.2.patch, 
> HIVE-23446.3.patch, HIVE-23446.4.patch
>
>
> [https://github.com/apache/hive/blob/master/llap-server/src/java/org/apache/hadoop/hive/llap/daemon/impl/QueryInfo.java#L343]
>  
> Umbilical UGI pool is maintained at QueryInfo level. When there are lots 
> of short queries, this ends up missing IPC cache and ends up recreating 
> threads/connections to the same AM.
> It would be good to maintain this pool in {{ContainerRunnerImpl}} instead and 
> recycle as needed.
>  




[jira] [Updated] (HIVE-23292) Reduce PartitionDesc payload in MapWork

2020-05-16 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-23292:

Status: Patch Available  (was: Open)

> Reduce PartitionDesc payload in MapWork
> ---
>
> Key: HIVE-23292
> URL: https://issues.apache.org/jira/browse/HIVE-23292
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Attachments: HIVE-23292.1.patch, HIVE-23292.1.patch
>
>
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/plan/MapWork.java#L105




[jira] [Commented] (HIVE-23292) Reduce PartitionDesc payload in MapWork

2020-05-16 Thread Ashutosh Chauhan (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-23292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17108952#comment-17108952
 ] 

Ashutosh Chauhan commented on HIVE-23292:
-

[~rajesh.balamohan] I have reworked the patch so that, instead of removing the 
params, it does not add them in the first place. It also removes a few additional 
unneeded params. Would you like to review it at https://reviews.apache.org/r/72519/

> Reduce PartitionDesc payload in MapWork
> ---
>
> Key: HIVE-23292
> URL: https://issues.apache.org/jira/browse/HIVE-23292
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Attachments: HIVE-23292.1.patch, HIVE-23292.1.patch
>
>
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/plan/MapWork.java#L105




[jira] [Updated] (HIVE-23292) Reduce PartitionDesc payload in MapWork

2020-05-16 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-23292:

Attachment: (was: HIVE-22737.1.patch)

> Reduce PartitionDesc payload in MapWork
> ---
>
> Key: HIVE-23292
> URL: https://issues.apache.org/jira/browse/HIVE-23292
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Attachments: HIVE-23292.1.patch, HIVE-23292.1.patch
>
>
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/plan/MapWork.java#L105




[jira] [Updated] (HIVE-23292) Reduce PartitionDesc payload in MapWork

2020-05-16 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-23292:

Attachment: HIVE-23292.1.patch

> Reduce PartitionDesc payload in MapWork
> ---
>
> Key: HIVE-23292
> URL: https://issues.apache.org/jira/browse/HIVE-23292
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Attachments: HIVE-23292.1.patch, HIVE-23292.1.patch
>
>
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/plan/MapWork.java#L105




[jira] [Updated] (HIVE-23292) Reduce PartitionDesc payload in MapWork

2020-05-16 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-23292:

Status: Open  (was: Patch Available)

> Reduce PartitionDesc payload in MapWork
> ---
>
> Key: HIVE-23292
> URL: https://issues.apache.org/jira/browse/HIVE-23292
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Attachments: HIVE-23292.1.patch, HIVE-23292.1.patch
>
>
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/plan/MapWork.java#L105




[jira] [Updated] (HIVE-23292) Reduce PartitionDesc payload in MapWork

2020-05-16 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-23292:

Attachment: HIVE-22737.1.patch

> Reduce PartitionDesc payload in MapWork
> ---
>
> Key: HIVE-23292
> URL: https://issues.apache.org/jira/browse/HIVE-23292
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Attachments: HIVE-22737.1.patch, HIVE-23292.1.patch
>
>
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/plan/MapWork.java#L105




[jira] [Updated] (HIVE-23292) Reduce PartitionDesc payload in MapWork

2020-05-16 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-23292:

Status: Patch Available  (was: Open)

> Reduce PartitionDesc payload in MapWork
> ---
>
> Key: HIVE-23292
> URL: https://issues.apache.org/jira/browse/HIVE-23292
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Attachments: HIVE-22737.1.patch, HIVE-23292.1.patch
>
>
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/plan/MapWork.java#L105




[jira] [Updated] (HIVE-23292) Reduce PartitionDesc payload in MapWork

2020-05-16 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-23292:

Status: Open  (was: Patch Available)

> Reduce PartitionDesc payload in MapWork
> ---
>
> Key: HIVE-23292
> URL: https://issues.apache.org/jira/browse/HIVE-23292
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Attachments: HIVE-23292.1.patch
>
>
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/plan/MapWork.java#L105




[jira] [Updated] (HIVE-23375) Track MJ HashTable Load time

2020-05-16 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-23375:

Fix Version/s: 4.0.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, Panos!

> Track MJ HashTable Load time
> 
>
> Key: HIVE-23375
> URL: https://issues.apache.org/jira/browse/HIVE-23375
> Project: Hive
>  Issue Type: Improvement
>Reporter: Panagiotis Garefalakis
>Assignee: Panagiotis Garefalakis
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-23375.01.patch, HIVE-23375.02.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Introduce TezCounter to track MJ HashTable Load time




[jira] [Updated] (HIVE-23376) Avoid repeated SHA computation in GenericUDTFGetSplits for hive-exec jar

2020-05-16 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-23376:

Status: Open  (was: Patch Available)

> Avoid repeated SHA computation in GenericUDTFGetSplits for hive-exec jar
> 
>
> Key: HIVE-23376
> URL: https://issues.apache.org/jira/browse/HIVE-23376
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Attachments: HIVE-23376.2.patch, HIVE-23376.3.patch, 
> HIVE-23376.4.patch, image-2020-05-06-16-37-48-615.png
>
>
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDTFGetSplits.java#L706]
>  
>  
> !image-2020-05-06-16-37-48-615.png|width=946,height=639!




[jira] [Updated] (HIVE-23376) Avoid repeated SHA computation in GenericUDTFGetSplits for hive-exec jar

2020-05-16 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-23376:

Status: Patch Available  (was: Open)

> Avoid repeated SHA computation in GenericUDTFGetSplits for hive-exec jar
> 
>
> Key: HIVE-23376
> URL: https://issues.apache.org/jira/browse/HIVE-23376
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Attachments: HIVE-23376.2.patch, HIVE-23376.3.patch, 
> HIVE-23376.4.patch, image-2020-05-06-16-37-48-615.png
>
>
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDTFGetSplits.java#L706]
>  
>  
> !image-2020-05-06-16-37-48-615.png|width=946,height=639!




[jira] [Updated] (HIVE-23376) Avoid repeated SHA computation in GenericUDTFGetSplits for hive-exec jar

2020-05-16 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-23376:

Attachment: HIVE-23376.4.patch

> Avoid repeated SHA computation in GenericUDTFGetSplits for hive-exec jar
> 
>
> Key: HIVE-23376
> URL: https://issues.apache.org/jira/browse/HIVE-23376
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Attachments: HIVE-23376.2.patch, HIVE-23376.3.patch, 
> HIVE-23376.4.patch, image-2020-05-06-16-37-48-615.png
>
>
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDTFGetSplits.java#L706]
>  
>  
> !image-2020-05-06-16-37-48-615.png|width=946,height=639!




[jira] [Updated] (HIVE-23446) LLAP: Reduce IPC connection misses to AM for short queries

2020-05-16 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-23446:

Status: Open  (was: Patch Available)

> LLAP: Reduce IPC connection misses to AM for short queries
> --
>
> Key: HIVE-23446
> URL: https://issues.apache.org/jira/browse/HIVE-23446
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Major
> Attachments: HIVE-23446.1.patch, HIVE-23446.2.patch, 
> HIVE-23446.3.patch, HIVE-23446.4.patch
>
>
> [https://github.com/apache/hive/blob/master/llap-server/src/java/org/apache/hadoop/hive/llap/daemon/impl/QueryInfo.java#L343]
>  
> Umbilical UGI pool is maintained at QueryInfo level. When there are lots 
> of short queries, this ends up missing IPC cache and ends up recreating 
> threads/connections to the same AM.
> It would be good to maintain this pool in {{ContainerRunnerImpl}} instead and 
> recycle as needed.
>  




[jira] [Updated] (HIVE-23446) LLAP: Reduce IPC connection misses to AM for short queries

2020-05-16 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-23446:

Attachment: HIVE-23446.4.patch

> LLAP: Reduce IPC connection misses to AM for short queries
> --
>
> Key: HIVE-23446
> URL: https://issues.apache.org/jira/browse/HIVE-23446
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Major
> Attachments: HIVE-23446.1.patch, HIVE-23446.2.patch, 
> HIVE-23446.3.patch, HIVE-23446.4.patch
>
>
> [https://github.com/apache/hive/blob/master/llap-server/src/java/org/apache/hadoop/hive/llap/daemon/impl/QueryInfo.java#L343]
>  
> Umbilical UGI pool is maintained at QueryInfo level. When there are lots 
> of short queries, this ends up missing IPC cache and ends up recreating 
> threads/connections to the same AM.
> It would be good to maintain this pool in {{ContainerRunnerImpl}} instead and 
> recycle as needed.
>  




[jira] [Updated] (HIVE-23446) LLAP: Reduce IPC connection misses to AM for short queries

2020-05-16 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-23446:

Status: Patch Available  (was: Open)

> LLAP: Reduce IPC connection misses to AM for short queries
> --
>
> Key: HIVE-23446
> URL: https://issues.apache.org/jira/browse/HIVE-23446
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Major
> Attachments: HIVE-23446.1.patch, HIVE-23446.2.patch, 
> HIVE-23446.3.patch, HIVE-23446.4.patch
>
>
> [https://github.com/apache/hive/blob/master/llap-server/src/java/org/apache/hadoop/hive/llap/daemon/impl/QueryInfo.java#L343]
>  
> Umbilical UGI pool is maintained at QueryInfo level. When there are lots 
> of short queries, this ends up missing IPC cache and ends up recreating 
> threads/connections to the same AM.
> It would be good to maintain this pool in {{ContainerRunnerImpl}} instead and 
> recycle as needed.
>  




[jira] [Updated] (HIVE-23449) LLAP: Reduce mkdir and config creations in submitWork hotpath

2020-05-16 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-23449:

Status: Patch Available  (was: Open)

> LLAP: Reduce mkdir and config creations in submitWork hotpath
> -
>
> Key: HIVE-23449
> URL: https://issues.apache.org/jira/browse/HIVE-23449
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Major
> Attachments: HIVE-23449.1.patch, HIVE-23449.2.patch, 
> HIVE-23449.3.patch, HIVE-23449.4.patch, HIVE-23449.5.patch, Screenshot 
> 2020-05-12 at 1.09.35 PM.png
>
>
> !Screenshot 2020-05-12 at 1.09.35 PM.png|width=885,height=558!
>  
> For short jobs, submitWork is on the hot path. It can lazy-load the conf and 
> skip dir creations (which need to happen only when DirWatcher is 
> enabled).




[jira] [Updated] (HIVE-23449) LLAP: Reduce mkdir and config creations in submitWork hotpath

2020-05-16 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-23449:

Status: Open  (was: Patch Available)

> LLAP: Reduce mkdir and config creations in submitWork hotpath
> -
>
> Key: HIVE-23449
> URL: https://issues.apache.org/jira/browse/HIVE-23449
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Major
> Attachments: HIVE-23449.1.patch, HIVE-23449.2.patch, 
> HIVE-23449.3.patch, HIVE-23449.4.patch, HIVE-23449.5.patch, Screenshot 
> 2020-05-12 at 1.09.35 PM.png
>
>
> !Screenshot 2020-05-12 at 1.09.35 PM.png|width=885,height=558!
>  
> For short jobs, submitWork is on the hot path. It can lazy-load the conf and 
> skip dir creations (which need to happen only when DirWatcher is 
> enabled).




[jira] [Updated] (HIVE-23449) LLAP: Reduce mkdir and config creations in submitWork hotpath

2020-05-16 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-23449:

Attachment: HIVE-23449.5.patch

> LLAP: Reduce mkdir and config creations in submitWork hotpath
> -
>
> Key: HIVE-23449
> URL: https://issues.apache.org/jira/browse/HIVE-23449
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Major
> Attachments: HIVE-23449.1.patch, HIVE-23449.2.patch, 
> HIVE-23449.3.patch, HIVE-23449.4.patch, HIVE-23449.5.patch, Screenshot 
> 2020-05-12 at 1.09.35 PM.png
>
>
> !Screenshot 2020-05-12 at 1.09.35 PM.png|width=885,height=558!
>  
> For short jobs, submitWork is on the hot path. It can lazy-load the conf and 
> skip dir creations (which need to happen only when DirWatcher is 
> enabled).




[jira] [Commented] (HIVE-23446) LLAP: Reduce IPC connection misses to AM for short queries

2020-05-13 Thread Ashutosh Chauhan (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-23446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17106925#comment-17106925
 ] 

Ashutosh Chauhan commented on HIVE-23446:
-

+1 pending tests.

> LLAP: Reduce IPC connection misses to AM for short queries
> --
>
> Key: HIVE-23446
> URL: https://issues.apache.org/jira/browse/HIVE-23446
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Major
> Attachments: HIVE-23446.1.patch, HIVE-23446.2.patch
>
>
> [https://github.com/apache/hive/blob/master/llap-server/src/java/org/apache/hadoop/hive/llap/daemon/impl/QueryInfo.java#L343]
>  
> Umbilical UGI pool is maintained at QueryInfo level. When there are lots 
> of short queries, this ends up missing IPC cache and ends up recreating 
> threads/connections to the same AM.
> It would be good to maintain this pool in {{ContainerRunnerImpl}} instead and 
> recycle as needed.
>  




[jira] [Updated] (HIVE-23423) Check of disabling hash aggregation ignores grouping set

2020-05-13 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-23423:

Fix Version/s: 4.0.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

Pushed to master with updated title.
Thanks, Gopal!

> Check of disabling hash aggregation ignores grouping set
> 
>
> Key: HIVE-23423
> URL: https://issues.apache.org/jira/browse/HIVE-23423
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Operators, Query Processor
>Affects Versions: 4.0.0
>Reporter: Nita Dembla
>Assignee: Gopal Vijayaraghavan
>Priority: Major
>  Labels: Performance
> Fix For: 4.0.0
>
> Attachments: HIVE-23423.1.patch, HIVE-23423.WIP.patch
>
>
> https://issues.apache.org/jira/browse/HIVE-23356 fixed the issue with 
> disabling hash aggregation on grouping set queries. A fix is still needed for 
> VectorGroupbyOperator.




[jira] [Updated] (HIVE-23423) Check of disabling hash aggregation ignores grouping set

2020-05-13 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-23423:

Summary: Check of disabling hash aggregation ignores grouping set  (was: 
Hash aggregation is always disabled in vectorized execution of grouping set 
queries)

> Check of disabling hash aggregation ignores grouping set
> 
>
> Key: HIVE-23423
> URL: https://issues.apache.org/jira/browse/HIVE-23423
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Operators, Query Processor
>Affects Versions: 4.0.0
>Reporter: Nita Dembla
>Assignee: Gopal Vijayaraghavan
>Priority: Major
>  Labels: Performance
> Attachments: HIVE-23423.1.patch, HIVE-23423.WIP.patch
>
>
> https://issues.apache.org/jira/browse/HIVE-23356 fixed the issue with 
> disabling hash aggregation on grouping set queries. A fix is still needed for 
> VectorGroupbyOperator.




[jira] [Commented] (HIVE-23423) Hash aggregation is always disabled in vectorized execution of grouping set queries

2020-05-13 Thread Ashutosh Chauhan (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-23423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17106886#comment-17106886
 ] 

Ashutosh Chauhan commented on HIVE-23423:
-

+1

> Hash aggregation is always disabled in vectorized execution of grouping set 
> queries
> ---
>
> Key: HIVE-23423
> URL: https://issues.apache.org/jira/browse/HIVE-23423
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Operators, Query Processor
>Affects Versions: 4.0.0
>Reporter: Nita Dembla
>Assignee: Gopal Vijayaraghavan
>Priority: Major
>  Labels: Performance
> Attachments: HIVE-23423.1.patch, HIVE-23423.WIP.patch
>
>
> https://issues.apache.org/jira/browse/HIVE-23356 fixed the issue with 
> disabling hash aggregation on grouping set queries. A fix is still needed for 
> VectorGroupbyOperator.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Updated] (HIVE-23451) FileSinkOperator calls deleteOnExit (hdfs call) twice for the same file

2020-05-13 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-23451:

Fix Version/s: 4.0.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, Rajesh!

> FileSinkOperator calls deleteOnExit (hdfs call) twice for the same file
> ---
>
> Key: HIVE-23451
> URL: https://issues.apache.org/jira/browse/HIVE-23451
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Fix For: 4.0.0
>
> Attachments: HIVE-23451.1.patch, HIVE-23451.2.patch, 
> HIVE-23451.3.patch
>
>
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java#L826]
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java#L797]
> This can avoid a NN call here (i.e., mainly for small queries).
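A minimal sketch of one way to avoid issuing the same remote registration twice, assuming nothing about the actual patch (which may simply drop one of the two call sites): remember which paths have already been registered and skip the remote call for repeats. OnceRegistrar and its callback are hypothetical names, not Hive or HDFS APIs.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

// Hypothetical guard: forwards each distinct path to the remote call at most once.
final class OnceRegistrar<P> {
  private final Set<P> seen = ConcurrentHashMap.newKeySet();
  private final Consumer<P> remoteCall;

  OnceRegistrar(Consumer<P> remoteCall) {
    this.remoteCall = remoteCall;
  }

  // Returns true if the remote call was made, false if the path was already registered.
  boolean register(P path) {
    if (!seen.add(path)) {
      return false;
    }
    remoteCall.accept(path);
    return true;
  }
}
```

Set.add returns false for an element that is already present, so the second registration of the same path costs only a local lookup.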



