[jira] [Work logged] (HIVE-23822) Sorted dynamic partition optimization could remove auto stat task

2020-07-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23822?focusedWorklogId=459044&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-459044
 ]

ASF GitHub Bot logged work on HIVE-23822:
-

Author: ASF GitHub Bot
Created on: 15/Jul/20 00:56
Start Date: 15/Jul/20 00:56
Worklog Time Spent: 10m 
  Work Description: vineetgarg02 merged pull request #1231:
URL: https://github.com/apache/hive/pull/1231


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 459044)
Time Spent: 1.5h  (was: 1h 20m)

> Sorted dynamic partition optimization could remove auto stat task
> -
>
> Key: HIVE-23822
> URL: https://issues.apache.org/jira/browse/HIVE-23822
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> {{mm_dp}} has a reproducer where the INSERT query is missing the auto stats task.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23822) Sorted dynamic partition optimization could remove auto stat task

2020-07-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23822?focusedWorklogId=458955&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-458955
 ]

ASF GitHub Bot logged work on HIVE-23822:
-

Author: ASF GitHub Bot
Created on: 14/Jul/20 21:06
Start Date: 14/Jul/20 21:06
Worklog Time Spent: 10m 
  Work Description: vineetgarg02 commented on a change in pull request 
#1231:
URL: https://github.com/apache/hive/pull/1231#discussion_r454645150



##
File path: ql/src/java/org/apache/hadoop/hive/ql/optimizer/SortedDynPartitionOptimizer.java
##
@@ -409,26 +409,54 @@ private boolean removeRSInsertedByEnforceBucketing(FileSinkOperator fsOp) {
       // and grand child
       if (found) {
         Operator<? extends OperatorDesc> rsParent = rsToRemove.getParentOperators().get(0);
-        Operator<? extends OperatorDesc> rsChild = rsToRemove.getChildOperators().get(0);
-        Operator<? extends OperatorDesc> rsGrandChild = rsChild.getChildOperators().get(0);
-
-        if (rsChild instanceof SelectOperator) {
-          // if schema size cannot be matched, then it could be because of constant folding
-          // converting partition column expression to constant expression. The constant
-          // expression will then get pruned by column pruner since it will not reference to
-          // any columns.
-          if (rsParent.getSchema().getSignature().size() !=
-              rsChild.getSchema().getSignature().size()) {
+        List<Operator<? extends OperatorDesc>> rsChildren = rsToRemove.getChildOperators();
+
+        Operator<? extends OperatorDesc> rsChildToRemove = null;
+
+        for (Operator<? extends OperatorDesc> rsChild : rsChildren) {

Review comment:
   @jcamachor all tests passed.







Issue Time Tracking
---

Worklog Id: (was: 458955)
Time Spent: 1h 20m  (was: 1h 10m)



[jira] [Work logged] (HIVE-23822) Sorted dynamic partition optimization could remove auto stat task

2020-07-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23822?focusedWorklogId=458798&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-458798
 ]

ASF GitHub Bot logged work on HIVE-23822:
-

Author: ASF GitHub Bot
Created on: 14/Jul/20 17:11
Start Date: 14/Jul/20 17:11
Worklog Time Spent: 10m 
  Work Description: vineetgarg02 commented on a change in pull request 
#1231:
URL: https://github.com/apache/hive/pull/1231#discussion_r454511428



Review comment:
   @jcamachor I have addressed this in the latest commit.







Issue Time Tracking
---

Worklog Id: (was: 458798)
Time Spent: 1h 10m  (was: 1h)



[jira] [Work logged] (HIVE-23822) Sorted dynamic partition optimization could remove auto stat task

2020-07-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23822?focusedWorklogId=458797&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-458797
 ]

ASF GitHub Bot logged work on HIVE-23822:
-

Author: ASF GitHub Bot
Created on: 14/Jul/20 17:10
Start Date: 14/Jul/20 17:10
Worklog Time Spent: 10m 
  Work Description: vineetgarg02 commented on a change in pull request 
#1231:
URL: https://github.com/apache/hive/pull/1231#discussion_r454511428



Review comment:
   @jcamachor I have addressed this in the latest commit.







Issue Time Tracking
---

Worklog Id: (was: 458797)
Time Spent: 1h  (was: 50m)



[jira] [Work logged] (HIVE-23822) Sorted dynamic partition optimization could remove auto stat task

2020-07-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23822?focusedWorklogId=458743&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-458743
 ]

ASF GitHub Bot logged work on HIVE-23822:
-

Author: ASF GitHub Bot
Created on: 14/Jul/20 15:58
Start Date: 14/Jul/20 15:58
Worklog Time Spent: 10m 
  Work Description: jcamachor commented on a change in pull request #1231:
URL: https://github.com/apache/hive/pull/1231#discussion_r454464296



Review comment:
   Yes, there should not be an RS with multiple children, so we can simplify 
that code. You can even add an assert to the new code to make sure.
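
The single-child assumption discussed in this review thread can be made explicit with a fail-fast check. What follows is a minimal, self-contained sketch, not the code from the pull request: `Node`, `SingleChildCheck`, and `onlyChild` are hypothetical stand-ins for Hive's operator classes, and a plain `IllegalStateException` stands in for whichever Precondition or assert the final patch uses.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an operator in a plan tree; not Hive's Operator class.
final class Node {
  final String name;
  final List<Node> children = new ArrayList<>();

  Node(String name) {
    this.name = name;
  }

  // Appends a child and returns this node so calls can be chained.
  Node addChild(Node child) {
    children.add(child);
    return this;
  }
}

final class SingleChildCheck {
  private SingleChildCheck() {}

  // Returns the only child of rs, failing fast if the
  // "exactly one child" assumption does not hold.
  static Node onlyChild(Node rs) {
    List<Node> children = rs.children;
    if (children.size() != 1) {
      throw new IllegalStateException(
          "Expected exactly one child for " + rs.name + " but found " + children.size());
    }
    return children.get(0);
  }

  public static void main(String[] args) {
    Node rs = new Node("RS").addChild(new Node("SEL"));
    System.out.println(onlyChild(rs).name); // prints "SEL"
  }
}
```

Compared with iterating over all children, a check like this keeps the removal logic simple and surfaces a violated assumption immediately instead of silently picking one branch.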







Issue Time Tracking
---

Worklog Id: (was: 458743)
Time Spent: 50m  (was: 40m)



[jira] [Work logged] (HIVE-23822) Sorted dynamic partition optimization could remove auto stat task

2020-07-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23822?focusedWorklogId=458740&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-458740
 ]

ASF GitHub Bot logged work on HIVE-23822:
-

Author: ASF GitHub Bot
Created on: 14/Jul/20 15:55
Start Date: 14/Jul/20 15:55
Worklog Time Spent: 10m 
  Work Description: vineetgarg02 commented on a change in pull request 
#1231:
URL: https://github.com/apache/hive/pull/1231#discussion_r454462383



Review comment:
   I assumed that this could be a possibility and therefore accounted for it, 
but if that assumption is wrong I will update the code.







Issue Time Tracking
---

Worklog Id: (was: 458740)
Time Spent: 40m  (was: 0.5h)



[jira] [Work logged] (HIVE-23822) Sorted dynamic partition optimization could remove auto stat task

2020-07-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23822?focusedWorklogId=458449&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-458449
 ]

ASF GitHub Bot logged work on HIVE-23822:
-

Author: ASF GitHub Bot
Created on: 14/Jul/20 04:28
Start Date: 14/Jul/20 04:28
Worklog Time Spent: 10m 
  Work Description: jcamachor commented on a change in pull request #1231:
URL: https://github.com/apache/hive/pull/1231#discussion_r454092186



Review comment:
   In which case would we have an RS with multiple children? I thought this 
would never happen. Can we leave a comment explaining it? Otherwise, we should 
add a Precondition checking that the number of children is 1.







Issue Time Tracking
---

Worklog Id: (was: 458449)
Time Spent: 0.5h  (was: 20m)



[jira] [Work logged] (HIVE-23822) Sorted dynamic partition optimization could remove auto stat task

2020-07-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23822?focusedWorklogId=458448&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-458448
 ]

ASF GitHub Bot logged work on HIVE-23822:
-

Author: ASF GitHub Bot
Created on: 14/Jul/20 04:27
Start Date: 14/Jul/20 04:27
Worklog Time Spent: 10m 
  Work Description: jcamachor commented on a change in pull request #1231:
URL: https://github.com/apache/hive/pull/1231#discussion_r454092186



Review comment:
   In which case would we have an RS with multiple children? Can we leave a 
comment explaining it? Otherwise, we should add a Precondition checking that 
the number of children is 1.







Issue Time Tracking
---

Worklog Id: (was: 458448)
Time Spent: 20m  (was: 10m)



[jira] [Work logged] (HIVE-23822) Sorted dynamic partition optimization could remove auto stat task

2020-07-08 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23822?focusedWorklogId=456363&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-456363
 ]

ASF GitHub Bot logged work on HIVE-23822:
-

Author: ASF GitHub Bot
Created on: 08/Jul/20 23:00
Start Date: 08/Jul/20 23:00
Worklog Time Spent: 10m 
  Work Description: vineetgarg02 opened a new pull request #1231:
URL: https://github.com/apache/hive/pull/1231


   …at task
   
   ## NOTICE
   
   Please create an issue in ASF JIRA before opening a pull request,
   and you need to set the title of the pull request which starts with
   the corresponding JIRA issue number. (e.g. HIVE-X: Fix a typo in YYY)
   For more details, please see 
https://cwiki.apache.org/confluence/display/Hive/HowToContribute
   





Issue Time Tracking
---

Worklog Id: (was: 456363)
Remaining Estimate: 0h
Time Spent: 10m
