[jira] [Work logged] (HIVE-23822) Sorted dynamic partition optimization could remove auto stat task
[ https://issues.apache.org/jira/browse/HIVE-23822?focusedWorklogId=459044=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-459044 ]

ASF GitHub Bot logged work on HIVE-23822:
-
Author: ASF GitHub Bot
Created on: 15/Jul/20 00:56
Start Date: 15/Jul/20 00:56
Worklog Time Spent: 10m
Work Description: vineetgarg02 merged pull request #1231:
URL: https://github.com/apache/hive/pull/1231

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
---
Worklog Id: (was: 459044)
Time Spent: 1.5h (was: 1h 20m)

> Sorted dynamic partition optimization could remove auto stat task
> -
>
> Key: HIVE-23822
> URL: https://issues.apache.org/jira/browse/HIVE-23822
> Project: Hive
> Issue Type: Bug
> Components: Query Planning
> Reporter: Vineet Garg
> Assignee: Vineet Garg
> Priority: Major
> Labels: pull-request-available
> Time Spent: 1.5h
> Remaining Estimate: 0h
>
> {{mm_dp}} has reproducer where INSERT query is missing auto stats task.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Work logged] (HIVE-23822) Sorted dynamic partition optimization could remove auto stat task
[ https://issues.apache.org/jira/browse/HIVE-23822?focusedWorklogId=458955=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-458955 ]

ASF GitHub Bot logged work on HIVE-23822:
-
Author: ASF GitHub Bot
Created on: 14/Jul/20 21:06
Start Date: 14/Jul/20 21:06
Worklog Time Spent: 10m
Work Description: vineetgarg02 commented on a change in pull request #1231:
URL: https://github.com/apache/hive/pull/1231#discussion_r454645150

## File path: ql/src/java/org/apache/hadoop/hive/ql/optimizer/SortedDynPartitionOptimizer.java ##
@@ -409,26 +409,54 @@ private boolean removeRSInsertedByEnforceBucketing(FileSinkOperator fsOp) {
 // and grand child
 if (found) {
 Operator rsParent = rsToRemove.getParentOperators().get(0);
-Operator rsChild = rsToRemove.getChildOperators().get(0);
-Operator rsGrandChild = rsChild.getChildOperators().get(0);
-
-if (rsChild instanceof SelectOperator) {
-  // if schema size cannot be matched, then it could be because of constant folding
-  // converting partition column expression to constant expression. The constant
-  // expression will then get pruned by column pruner since it will not reference to
-  // any columns.
-  if (rsParent.getSchema().getSignature().size() !=
-  rsChild.getSchema().getSignature().size()) {
+List> rsChildren = rsToRemove.getChildOperators();
+
+Operator rsChildToRemove = null;
+
+for (Operator rsChild : rsChildren) {

Review comment:
@jcamachor all tests passed.
Issue Time Tracking
---
Worklog Id: (was: 458955)
Time Spent: 1h 20m (was: 1h 10m)
[jira] [Work logged] (HIVE-23822) Sorted dynamic partition optimization could remove auto stat task
[ https://issues.apache.org/jira/browse/HIVE-23822?focusedWorklogId=458798=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-458798 ]

ASF GitHub Bot logged work on HIVE-23822:
-
Author: ASF GitHub Bot
Created on: 14/Jul/20 17:11
Start Date: 14/Jul/20 17:11
Worklog Time Spent: 10m
Work Description: vineetgarg02 commented on a change in pull request #1231:
URL: https://github.com/apache/hive/pull/1231#discussion_r454511428

## File path: ql/src/java/org/apache/hadoop/hive/ql/optimizer/SortedDynPartitionOptimizer.java ##
@@ -409,26 +409,54 @@ private boolean removeRSInsertedByEnforceBucketing(FileSinkOperator fsOp) {
 // and grand child
 if (found) {
 Operator rsParent = rsToRemove.getParentOperators().get(0);
-Operator rsChild = rsToRemove.getChildOperators().get(0);
-Operator rsGrandChild = rsChild.getChildOperators().get(0);
-
-if (rsChild instanceof SelectOperator) {
-  // if schema size cannot be matched, then it could be because of constant folding
-  // converting partition column expression to constant expression. The constant
-  // expression will then get pruned by column pruner since it will not reference to
-  // any columns.
-  if (rsParent.getSchema().getSignature().size() !=
-  rsChild.getSchema().getSignature().size()) {
+List> rsChildren = rsToRemove.getChildOperators();
+
+Operator rsChildToRemove = null;
+
+for (Operator rsChild : rsChildren) {

Review comment:
@jcamachor I have addressed this in the latest commit.
Issue Time Tracking
---
Worklog Id: (was: 458798)
Time Spent: 1h 10m (was: 1h)
[jira] [Work logged] (HIVE-23822) Sorted dynamic partition optimization could remove auto stat task
[ https://issues.apache.org/jira/browse/HIVE-23822?focusedWorklogId=458797=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-458797 ]

ASF GitHub Bot logged work on HIVE-23822:
-
Author: ASF GitHub Bot
Created on: 14/Jul/20 17:10
Start Date: 14/Jul/20 17:10
Worklog Time Spent: 10m
Work Description: vineetgarg02 commented on a change in pull request #1231:
URL: https://github.com/apache/hive/pull/1231#discussion_r454511428

## File path: ql/src/java/org/apache/hadoop/hive/ql/optimizer/SortedDynPartitionOptimizer.java ##
@@ -409,26 +409,54 @@ private boolean removeRSInsertedByEnforceBucketing(FileSinkOperator fsOp) {
 // and grand child
 if (found) {
 Operator rsParent = rsToRemove.getParentOperators().get(0);
-Operator rsChild = rsToRemove.getChildOperators().get(0);
-Operator rsGrandChild = rsChild.getChildOperators().get(0);
-
-if (rsChild instanceof SelectOperator) {
-  // if schema size cannot be matched, then it could be because of constant folding
-  // converting partition column expression to constant expression. The constant
-  // expression will then get pruned by column pruner since it will not reference to
-  // any columns.
-  if (rsParent.getSchema().getSignature().size() !=
-  rsChild.getSchema().getSignature().size()) {
+List> rsChildren = rsToRemove.getChildOperators();
+
+Operator rsChildToRemove = null;
+
+for (Operator rsChild : rsChildren) {

Review comment:
@jcamachor I have addressed this in the latest comment.
Issue Time Tracking
---
Worklog Id: (was: 458797)
Time Spent: 1h (was: 50m)
[jira] [Work logged] (HIVE-23822) Sorted dynamic partition optimization could remove auto stat task
[ https://issues.apache.org/jira/browse/HIVE-23822?focusedWorklogId=458743=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-458743 ]

ASF GitHub Bot logged work on HIVE-23822:
-
Author: ASF GitHub Bot
Created on: 14/Jul/20 15:58
Start Date: 14/Jul/20 15:58
Worklog Time Spent: 10m
Work Description: jcamachor commented on a change in pull request #1231:
URL: https://github.com/apache/hive/pull/1231#discussion_r454464296

## File path: ql/src/java/org/apache/hadoop/hive/ql/optimizer/SortedDynPartitionOptimizer.java ##
@@ -409,26 +409,54 @@ private boolean removeRSInsertedByEnforceBucketing(FileSinkOperator fsOp) {
 // and grand child
 if (found) {
 Operator rsParent = rsToRemove.getParentOperators().get(0);
-Operator rsChild = rsToRemove.getChildOperators().get(0);
-Operator rsGrandChild = rsChild.getChildOperators().get(0);
-
-if (rsChild instanceof SelectOperator) {
-  // if schema size cannot be matched, then it could be because of constant folding
-  // converting partition column expression to constant expression. The constant
-  // expression will then get pruned by column pruner since it will not reference to
-  // any columns.
-  if (rsParent.getSchema().getSignature().size() !=
-  rsChild.getSchema().getSignature().size()) {
+List> rsChildren = rsToRemove.getChildOperators();
+
+Operator rsChildToRemove = null;
+
+for (Operator rsChild : rsChildren) {

Review comment:
Yes, there should not be a RS with multiple children, so we can simplify that code. You can even add an assert to the new code to make sure.
Issue Time Tracking
---
Worklog Id: (was: 458743)
Time Spent: 50m (was: 40m)
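The single-child invariant discussed in the review comment above (an RS should have exactly one child, enforced by an assert or a precondition) can be sketched roughly as follows. This is only an illustration under stated assumptions: `Node`, `SingleChildCheck`, and `onlyChild` are hypothetical names invented for this sketch, not Hive's `Operator` API and not the code that was merged in pull request #1231.

```java
import java.util.ArrayList;
import java.util.List;

public class SingleChildCheck {

    /** Hypothetical stand-in for a plan operator; this is NOT Hive's Operator class. */
    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();

        Node(String name) {
            this.name = name;
        }

        Node addChild(Node child) {
            children.add(child);
            return this;
        }
    }

    /**
     * Returns the only child of the given node, failing fast when the
     * "exactly one child" invariant does not hold (hypothetical helper).
     */
    static Node onlyChild(Node rs) {
        List<Node> children = rs.children;
        if (children.size() != 1) {
            throw new IllegalStateException(
                "Expected exactly one child for " + rs.name + " but found " + children.size());
        }
        return children.get(0);
    }

    public static void main(String[] args) {
        Node rs = new Node("RS").addChild(new Node("SEL"));
        // Prints the name of the single child.
        System.out.println(onlyChild(rs).name);
    }
}
```

An explicit check like this stays active in production runs, whereas a plain Java `assert` only fires when assertions are enabled with `-ea`; which of the two fits better depends on how strongly the invariant should be enforced.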
[jira] [Work logged] (HIVE-23822) Sorted dynamic partition optimization could remove auto stat task
[ https://issues.apache.org/jira/browse/HIVE-23822?focusedWorklogId=458740=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-458740 ]

ASF GitHub Bot logged work on HIVE-23822:
-
Author: ASF GitHub Bot
Created on: 14/Jul/20 15:55
Start Date: 14/Jul/20 15:55
Worklog Time Spent: 10m
Work Description: vineetgarg02 commented on a change in pull request #1231:
URL: https://github.com/apache/hive/pull/1231#discussion_r454462383

## File path: ql/src/java/org/apache/hadoop/hive/ql/optimizer/SortedDynPartitionOptimizer.java ##
@@ -409,26 +409,54 @@ private boolean removeRSInsertedByEnforceBucketing(FileSinkOperator fsOp) {
 // and grand child
 if (found) {
 Operator rsParent = rsToRemove.getParentOperators().get(0);
-Operator rsChild = rsToRemove.getChildOperators().get(0);
-Operator rsGrandChild = rsChild.getChildOperators().get(0);
-
-if (rsChild instanceof SelectOperator) {
-  // if schema size cannot be matched, then it could be because of constant folding
-  // converting partition column expression to constant expression. The constant
-  // expression will then get pruned by column pruner since it will not reference to
-  // any columns.
-  if (rsParent.getSchema().getSignature().size() !=
-  rsChild.getSchema().getSignature().size()) {
+List> rsChildren = rsToRemove.getChildOperators();
+
+Operator rsChildToRemove = null;
+
+for (Operator rsChild : rsChildren) {

Review comment:
I assumed that this could be a possibility and therefore accounted for it, but if this assumption is wrong I will update the code.
Issue Time Tracking
---
Worklog Id: (was: 458740)
Time Spent: 40m (was: 0.5h)
[jira] [Work logged] (HIVE-23822) Sorted dynamic partition optimization could remove auto stat task
[ https://issues.apache.org/jira/browse/HIVE-23822?focusedWorklogId=458449=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-458449 ]

ASF GitHub Bot logged work on HIVE-23822:
-
Author: ASF GitHub Bot
Created on: 14/Jul/20 04:28
Start Date: 14/Jul/20 04:28
Worklog Time Spent: 10m
Work Description: jcamachor commented on a change in pull request #1231:
URL: https://github.com/apache/hive/pull/1231#discussion_r454092186

## File path: ql/src/java/org/apache/hadoop/hive/ql/optimizer/SortedDynPartitionOptimizer.java ##
@@ -409,26 +409,54 @@ private boolean removeRSInsertedByEnforceBucketing(FileSinkOperator fsOp) {
 // and grand child
 if (found) {
 Operator rsParent = rsToRemove.getParentOperators().get(0);
-Operator rsChild = rsToRemove.getChildOperators().get(0);
-Operator rsGrandChild = rsChild.getChildOperators().get(0);
-
-if (rsChild instanceof SelectOperator) {
-  // if schema size cannot be matched, then it could be because of constant folding
-  // converting partition column expression to constant expression. The constant
-  // expression will then get pruned by column pruner since it will not reference to
-  // any columns.
-  if (rsParent.getSchema().getSignature().size() !=
-  rsChild.getSchema().getSignature().size()) {
+List> rsChildren = rsToRemove.getChildOperators();
+
+Operator rsChildToRemove = null;
+
+for (Operator rsChild : rsChildren) {

Review comment:
In which case would we have a RS with multiple children? I thought this would never happen. Can we leave a comment explaining it? Otherwise, we should add a Precondition that the number of children is 1.
Issue Time Tracking
---
Worklog Id: (was: 458449)
Time Spent: 0.5h (was: 20m)
[jira] [Work logged] (HIVE-23822) Sorted dynamic partition optimization could remove auto stat task
[ https://issues.apache.org/jira/browse/HIVE-23822?focusedWorklogId=458448=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-458448 ]

ASF GitHub Bot logged work on HIVE-23822:
-
Author: ASF GitHub Bot
Created on: 14/Jul/20 04:27
Start Date: 14/Jul/20 04:27
Worklog Time Spent: 10m
Work Description: jcamachor commented on a change in pull request #1231:
URL: https://github.com/apache/hive/pull/1231#discussion_r454092186

## File path: ql/src/java/org/apache/hadoop/hive/ql/optimizer/SortedDynPartitionOptimizer.java ##
@@ -409,26 +409,54 @@ private boolean removeRSInsertedByEnforceBucketing(FileSinkOperator fsOp) {
 // and grand child
 if (found) {
 Operator rsParent = rsToRemove.getParentOperators().get(0);
-Operator rsChild = rsToRemove.getChildOperators().get(0);
-Operator rsGrandChild = rsChild.getChildOperators().get(0);
-
-if (rsChild instanceof SelectOperator) {
-  // if schema size cannot be matched, then it could be because of constant folding
-  // converting partition column expression to constant expression. The constant
-  // expression will then get pruned by column pruner since it will not reference to
-  // any columns.
-  if (rsParent.getSchema().getSignature().size() !=
-  rsChild.getSchema().getSignature().size()) {
+List> rsChildren = rsToRemove.getChildOperators();
+
+Operator rsChildToRemove = null;
+
+for (Operator rsChild : rsChildren) {

Review comment:
In which case would we have a RS with multiple children? Can we leave a comment explaining it? Otherwise, we should add a Precondition that the number of children is 1.
Issue Time Tracking
---
Worklog Id: (was: 458448)
Time Spent: 20m (was: 10m)
[jira] [Work logged] (HIVE-23822) Sorted dynamic partition optimization could remove auto stat task
[ https://issues.apache.org/jira/browse/HIVE-23822?focusedWorklogId=456363=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-456363 ]

ASF GitHub Bot logged work on HIVE-23822:
-
Author: ASF GitHub Bot
Created on: 08/Jul/20 23:00
Start Date: 08/Jul/20 23:00
Worklog Time Spent: 10m
Work Description: vineetgarg02 opened a new pull request #1231:
URL: https://github.com/apache/hive/pull/1231

…at task

## NOTICE

Please create an issue in ASF JIRA before opening a pull request, and you need to set the title of the pull request which starts with the corresponding JIRA issue number. (e.g. HIVE-X: Fix a typo in YYY)
For more details, please see https://cwiki.apache.org/confluence/display/Hive/HowToContribute

Issue Time Tracking
---
Worklog Id: (was: 456363)
Remaining Estimate: 0h
Time Spent: 10m