[jira] [Commented] (HIVE-948) more query plan optimization rules

2013-02-10 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13575403#comment-13575403
 ] 

Ashutosh Chauhan commented on HIVE-948:
---

Comments on phabricator.

> more query plan optimization rules 
> ---
>
> Key: HIVE-948
> URL: https://issues.apache.org/jira/browse/HIVE-948
> Project: Hive
> Issue Type: Improvement
> Reporter: Ning Zhang
> Assignee: Navis
> Attachments: HIVE-948.D8463.1.patch
>
>
> Many query plans are not optimal in that they contain redundant operators. 
> Some examples are unnecessary select operators (select followed by select, 
> select output being the same as input etc.). Even though these operators are 
> not very expensive, they could account for around 10% of CPU time in some 
> simple queries. It seems they are low-hanging fruits that we should pick 
> first. 
> BTW, it seems these optimization rules should be added at the last stage of 
> the physical optimization phase since some redundant operators are added to 
> facilitate physical plan generation. 
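
As a rough illustration of the "select output being the same as input" case described in the issue, the sketch below drops identity selects from a linear operator chain. PlanOp and RedundantSelectSketch are hypothetical, simplified stand-ins invented for this example; they are not Hive's Operator classes and not the code from the attached patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified plan operator (assumption: not Hive's Operator class).
class PlanOp {
  final String kind;           // e.g. "TS", "SEL", "FIL", "FS"
  final List<String> inputs;   // column names consumed
  final List<String> outputs;  // column names produced

  PlanOp(String kind, List<String> inputs, List<String> outputs) {
    this.kind = kind;
    this.inputs = inputs;
    this.outputs = outputs;
  }
}

class RedundantSelectSketch {
  // Returns a copy of a linear operator chain without identity selects,
  // i.e. SEL operators whose output columns equal their input columns.
  static List<PlanOp> dropIdentitySelects(List<PlanOp> chain) {
    List<PlanOp> result = new ArrayList<>();
    for (PlanOp op : chain) {
      boolean identitySelect = op.kind.equals("SEL") && op.inputs.equals(op.outputs);
      if (!identitySelect) {
        result.add(op);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    List<String> cols = Arrays.asList("key", "value");
    List<String> keyOnly = Arrays.asList("key");
    List<PlanOp> chain = Arrays.asList(
        new PlanOp("TS", cols, cols),
        new PlanOp("SEL", cols, cols),      // identity select: redundant
        new PlanOp("SEL", cols, keyOnly),   // real projection: kept
        new PlanOp("FS", keyOnly, keyOnly));
    for (PlanOp op : dropIdentitySelects(chain)) {
      System.out.println(op.kind + " " + op.outputs);
    }
  }
}
```

A real optimizer walks an operator DAG rather than a list and must also rewire parent/child links, which this sketch leaves out.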

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-948) more query plan optimization rules

2013-02-10 Thread Phabricator (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13575404#comment-13575404
 ] 

Phabricator commented on HIVE-948:
--

ashutoshc has requested changes to the revision "HIVE-948 [jira] more query 
plan optimization rules".

  Some comments.

INLINE COMMENTS
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/CleanupProcessor.java:47 I 
think a better name for this class would be NonBlockingOpDeDupProc.
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/CleanupProcessor.java:91 It 
doesn't look like terminal is used. It should be removed.
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/CleanupProcessor.java:101 
Would you also like to add cFil = null here?
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/CleanupProcessor.java:78 
It looks like, to be safe, we should also add cSel = null here.
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/CleanupProcessor.java:88 It 
would be good to add a comment explaining why we don't attempt to merge 
filters in the case of a sampling predicate.
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/CleanupProcessor.java:75 Do 
you also want to add: 
pSEL.getConf().setSelectStar(pSEL.getConf().isSelectStar() || 
cSEL.getConf().isSelectStar());

  And similarly for selStarNoCompute?
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/CleanupProcessor.java:98 Do 
you also want to add 
pFIL.getConf().setSortedFilter(pFIL.getConf().getSortedFilter() || 
cFIL.getConf().getSortedFilter()); ?
  
ql/src/java/org/apache/hadoop/hive/ql/ppd/PredicateTransitivePropagate.java:76 
RS -> MJ is disallowed after HIVE-3784, so this part of this rule can be 
removed.
  
ql/src/java/org/apache/hadoop/hive/ql/ppd/PredicateTransitivePropagate.java:66 
Add annotation @Override
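
The filter-related comments above (lines 88 and 98 of CleanupProcessor.java) could be sketched roughly as follows. FilterConfSketch and FilterMergeSketch are hypothetical names invented for this illustration, and combining the two predicates with AND is an assumption about what "merging filters" means here; the actual patch may differ.

```java
// Hypothetical, simplified filter descriptor (assumption: not Hive's FilterDesc).
class FilterConfSketch {
  String predicate;
  boolean samplingPredicate;
  boolean sortedFilter;

  FilterConfSketch(String predicate, boolean samplingPredicate, boolean sortedFilter) {
    this.predicate = predicate;
    this.samplingPredicate = samplingPredicate;
    this.sortedFilter = sortedFilter;
  }
}

class FilterMergeSketch {
  // Folds the child filter into the parent filter.
  // Returns false and changes nothing when either side is a sampling
  // predicate, since the review asks that such filters not be merged.
  static boolean mergeInto(FilterConfSketch parent, FilterConfSketch child) {
    if (parent.samplingPredicate || child.samplingPredicate) {
      return false;
    }
    // Assumption: merging means both predicates must hold.
    parent.predicate = "(" + parent.predicate + ") AND (" + child.predicate + ")";
    // The surviving filter is "sorted" if either original filter was.
    parent.sortedFilter = parent.sortedFilter || child.sortedFilter;
    return true;
  }
}
```

Returning a boolean lets the caller decide whether the child operator can be removed from the tree.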

REVISION DETAIL
  https://reviews.facebook.net/D8463

BRANCH
  DPAL-1980

ARCANIST PROJECT
  hive

To: JIRA, ashutoshc, navis




[jira] [Commented] (HIVE-948) more query plan optimization rules

2013-02-11 Thread Phabricator (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13576276#comment-13576276
 ] 

Phabricator commented on HIVE-948:
--

navis has commented on the revision "HIVE-948 [jira] more query plan 
optimization rules".

INLINE COMMENTS
  
ql/src/java/org/apache/hadoop/hive/ql/ppd/PredicateTransitivePropagate.java:76 
I'll check that.
  
ql/src/java/org/apache/hadoop/hive/ql/ppd/PredicateTransitivePropagate.java:66 
ok.
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/CleanupProcessor.java:47 I 
thought other cleanups could be appended to this optimizer later. For that, 
NonBlockingOpDeDupProc seemed a little too specific.
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/CleanupProcessor.java:75 I 
didn't check the other configs. I'll do that.
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/CleanupProcessor.java:78 
Nullifying a local variable? ok.
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/CleanupProcessor.java:88 ok.
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/CleanupProcessor.java:91 ok.
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/CleanupProcessor.java:98 I 
missed that. Those conditions should be checked.
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/CleanupProcessor.java:101 ok.

REVISION DETAIL
  https://reviews.facebook.net/D8463

BRANCH
  DPAL-1980

ARCANIST PROJECT
  hive

To: JIRA, ashutoshc, navis




[jira] [Commented] (HIVE-948) more query plan optimization rules

2013-02-13 Thread Phabricator (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13577674#comment-13577674
 ] 

Phabricator commented on HIVE-948:
--

ashutoshc has requested changes to the revision "HIVE-948 [jira] more query 
plan optimization rules".

  Mostly looks good. I have a couple more comments, but I do think we need a 
better name than CleanupProcessor.

INLINE COMMENTS
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/CleanupProcessor.java:47 
CleanupProcessor does not indicate the kind of optimization it is doing; it is 
not descriptive enough. NonBlockingOpDeDupProc may not be the optimal choice, 
but we do need something better than CleanupProcessor.
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/CleanupProcessor.java:83 What 
if the parent isSelectStar and the child is not? Shouldn't the surviving 
operator (the parent) be selectStar if either the parent or the child is?
  
ql/src/java/org/apache/hadoop/hive/ql/ppd/PredicateTransitivePropagate.java:75 
Per Namit's comment here: 
https://reviews.facebook.net/D5325?id=26877#inline-39651 RS->MJ is not allowed 
anymore. So we should get rid of the second predicate of this OR rule here.

REVISION DETAIL
  https://reviews.facebook.net/D8463

BRANCH
  DPAL-1980

ARCANIST PROJECT
  hive

To: JIRA, ashutoshc, navis


> Attachments: HIVE-948.D8463.1.patch, HIVE-948.D8463.2.patch


[jira] [Commented] (HIVE-948) more query plan optimization rules

2013-02-14 Thread Phabricator (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13578920#comment-13578920
 ] 

Phabricator commented on HIVE-948:
--

navis has commented on the revision "HIVE-948 [jira] more query plan 
optimization rules".

INLINE COMMENTS
  
ql/src/java/org/apache/hadoop/hive/ql/ppd/PredicateTransitivePropagate.java:75 
I don't know the issue well, but there is still a mapjoin rule (R7:MAPJOIN%) 
in the genMapRedTasks() method in SemanticAnalyzer. What is that for?

  I'll remove that.
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/CleanupProcessor.java:47 ok, 
NonBlockingOpDeDupProc will be good. When we need other cleanups, we can 
rename it or do something else.

REVISION DETAIL
  https://reviews.facebook.net/D8463

BRANCH
  DPAL-1980

ARCANIST PROJECT
  hive

To: JIRA, ashutoshc, navis




[jira] [Commented] (HIVE-948) more query plan optimization rules

2013-02-18 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13580784#comment-13580784
 ] 

Ashutosh Chauhan commented on HIVE-948:
---

New file NonBlockingOpDedup.java is not in phabricator, so commenting here:
 // updates schema only (this should be the last optimizer modifying operator 
tree)
+pSEL.setSchema(cSEL.getSchema());
+  }
* Why does this need to be the last optimizer? Please add comments explaining 
the reason. Even if it needs to be, it currently is not the last one in 
Optimizer.java.
* Also, the parent should always have the child's schema, shouldn't it? If so, 
why is it inside the if () block?

+
+  pSEL.getConf().setSelectStar(cSEL.getConf().isSelectStar());
* Shouldn't the parent be selectStar when either the child or the parent 
itself is select-star, e.g., SEL(sel-star)-SEL(no-sel-star)? Then this should be 
pSEL.getConf().setSelectStar(cSEL.getConf().isSelectStar() || 
pSEL.getConf().isSelectStar()); If not, please add a comment explaining why, 
since it wasn't obvious.
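
To make the two questions above concrete, here is a minimal sketch of the proposed behaviour for a parent SEL followed by a child SEL. SelectConfSketch and SelectMergeSketch are hypothetical stand-ins (not Hive's SelectDesc or the NonBlockingOpDedup code), and the OR rule is only the suggestion made in this comment, not something the thread has confirmed as correct.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified select descriptor (assumption: not Hive's SelectDesc).
class SelectConfSketch {
  boolean selectStar;
  List<String> schema;

  SelectConfSketch(boolean selectStar, List<String> schema) {
    this.selectStar = selectStar;
    this.schema = schema;
  }
}

class SelectMergeSketch {
  // Collapses parent-SEL -> child-SEL into the surviving parent.
  static void mergeInto(SelectConfSketch parent, SelectConfSketch child) {
    // The surviving operator exposes what the child used to output.
    parent.schema = child.schema;
    // Proposed rule: select-star if either operator was select-star.
    parent.selectStar = parent.selectStar || child.selectStar;
  }

  public static void main(String[] args) {
    // SEL(sel-star)-SEL(no-sel-star), the case mentioned in the comment.
    SelectConfSketch parent = new SelectConfSketch(true, Arrays.asList("key", "value"));
    SelectConfSketch child = new SelectConfSketch(false, Arrays.asList("key"));
    mergeInto(parent, child);
    System.out.println(parent.selectStar + " " + parent.schema);
  }
}
```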


> Attachments: HIVE-948.D8463.1.patch, HIVE-948.D8463.2.patch, 
> HIVE-948.D8463.3.patch, HIVE-948.D8463.3.patch


[jira] [Commented] (HIVE-948) more query plan optimization rules

2013-02-20 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13582880#comment-13582880
 ] 

Navis commented on HIVE-948:


Ah, sorry. I'll update that.

bq. Why this needs to be last optimizer?
It does not update the other information for the SEL, including colExprMap, 
etc. The optimizers that follow, like GlobalLimitOptimizer or 
SimpleFetchOptimizer, do not modify the operator tree. (They possibly update 
that information, but I was even thinking of removing all of it in a 
CleanupProcessor, making the plan file smaller.)

bq. Also, parent should always have child's schema, isnt it?
I thought SEL(no-compute) does not have a schema because it just inherits that 
of its parent. I'll check it again.

bq. Shouldn't parent be selectStar either when child is select-star or parent 
itself is select-star.
I've excluded those situations before applying it, like this (in the missing 
file), because I'm not sure about it.
{code}
if (pSEL.getConf().isSelStarNoCompute()) {
  // SEL(no-compute)-SEL. never seen this condition, and removing parent is
  // not safe in current graph walker
  return null;
}
{code}



[jira] [Commented] (HIVE-948) more query plan optimization rules

2013-02-20 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13582973#comment-13582973
 ] 

Ashutosh Chauhan commented on HIVE-948:
---

Makes sense. Navis, once you update the patch (a few more .q files have been 
added to trunk since you last updated it), I will get it in. 

> Attachments: HIVE-948.D8463.1.patch, HIVE-948.D8463.2.patch, 
> HIVE-948.D8463.3.patch, HIVE-948.D8463.3.patch, HIVE-948.D8463.4.patch


[jira] [Commented] (HIVE-948) more query plan optimization rules

2013-02-21 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13584030#comment-13584030
 ] 

Ashutosh Chauhan commented on HIVE-948:
---

+1. Running tests.

> Attachments: HIVE-948.D8463.1.patch, HIVE-948.D8463.2.patch, 
> HIVE-948.D8463.3.patch, HIVE-948.D8463.3.patch, HIVE-948.D8463.4.patch, 
> HIVE-948.testresult_only.txt


[jira] [Commented] (HIVE-948) more query plan optimization rules

2013-02-21 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13584054#comment-13584054
 ] 

Namit Jain commented on HIVE-948:
-

comments



[jira] [Commented] (HIVE-948) more query plan optimization rules

2013-02-21 Thread Phabricator (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13584055#comment-13584055
 ] 

Phabricator commented on HIVE-948:
--

njain has commented on the revision "HIVE-948 [jira] more query plan 
optimization rules".

INLINE COMMENTS
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/Optimizer.java:98 I missed 
this patch earlier and have some questions.
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/NonBlockingOpDeDupProc.java:93 
Have you missed some files? This function, removeChild..., does not seem to 
exist.
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/NonBlockingOpDeDupProc.java:80 
You are checking for SEL-SEL(compute) here, so the comment looks wrong.

REVISION DETAIL
  https://reviews.facebook.net/D8463

To: JIRA, ashutoshc, navis
Cc: njain




[jira] [Commented] (HIVE-948) more query plan optimization rules

2013-02-22 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13584374#comment-13584374
 ] 

Ashutosh Chauhan commented on HIVE-948:
---

The following queries failed:
* testCliDriver_auto_smb_mapjoin_14
* testCliDriver_binarysortable_1
* testCliDriver_udf_reflect2

For the last two queries, we simply need to update the .q.out files.



[jira] [Commented] (HIVE-948) more query plan optimization rules

2013-02-23 Thread Phabricator (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13585084#comment-13585084
 ] 

Phabricator commented on HIVE-948:
--

navis has commented on the revision "HIVE-948 [jira] more query plan 
optimization rules".

INLINE COMMENTS
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/NonBlockingOpDeDupProc.java:93 
?? The method was added by HIVE-1750, which was committed by you.
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/NonBlockingOpDeDupProc.java:80 
Will be fixed.

REVISION DETAIL
  https://reviews.facebook.net/D8463

To: JIRA, ashutoshc, navis
Cc: njain




[jira] [Commented] (HIVE-948) more query plan optimization rules

2013-02-23 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13585085#comment-13585085
 ] 

Navis commented on HIVE-948:


I had fixed the auto_smb_mapjoin_14 failure before, but that fix is missing 
from the current source code. Sorry.
Running tests.



[jira] [Commented] (HIVE-948) more query plan optimization rules

2013-02-23 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13585269#comment-13585269
 ] 

Navis commented on HIVE-948:


While generating the result for "udf_reflect2", I realized that it's better 
not to merge two SEL operators if the child SEL references, twice or more, a 
column of the parent SEL that is the result of a function. For example,

select reflect2(ts, "getYear"), reflect2(ts, "getMonth") from (select cast(key 
as timestamp) as ts from tbl) a;
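
A minimal sketch of such a guard is shown below. In the query above, merging the two SELs would presumably inline cast(key as timestamp) into both reflect2 calls, so the cast would run twice per row. SelMergeGuardSketch and its parameters are hypothetical names made up for this illustration; the check in the actual patch may be implemented differently.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class SelMergeGuardSketch {
  // childRefs: one entry per column reference made by the child SEL's
  //            expressions, naming the parent output column being read.
  // computedParentCols: parent output columns that are results of a function
  //            (as opposed to plain column pass-through).
  // Returns false if any computed parent column is referenced more than once,
  // because inlining it into the child would evaluate the function repeatedly.
  static boolean safeToMerge(List<String> childRefs, Set<String> computedParentCols) {
    Map<String, Integer> counts = new HashMap<>();
    for (String col : childRefs) {
      if (computedParentCols.contains(col)) {
        int seen = counts.merge(col, 1, Integer::sum);
        if (seen > 1) {
          return false;
        }
      }
    }
    return true;
  }
}
```

For the example query, childRefs would be ["ts", "ts"] and computedParentCols would be {"ts"}, so the guard reports that the merge should be skipped.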


> Attachments: HIVE-948.D8463.1.patch, HIVE-948.D8463.2.patch, 
> HIVE-948.D8463.3.patch, HIVE-948.D8463.3.patch, HIVE-948.D8463.4.patch, 
> HIVE-948.D8463.5.patch, HIVE-948.testresult_only.txt


[jira] [Commented] (HIVE-948) more query plan optimization rules

2013-02-25 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13585939#comment-13585939
 ] 

Ashutosh Chauhan commented on HIVE-948:
---

Thanks Navis for updating patch. Running tests again.

> Attachments: HIVE-948.D8463.1.patch, HIVE-948.D8463.2.patch, 
> HIVE-948.D8463.3.patch, HIVE-948.D8463.3.patch, HIVE-948.D8463.4.patch, 
> HIVE-948.D8463.5.patch, HIVE-948.testresult_only.1.txt


[jira] [Commented] (HIVE-948) more query plan optimization rules

2013-02-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13586789#comment-13586789
 ] 

Hudson commented on HIVE-948:
-

Integrated in Hive-trunk-h0.21 #1987 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/1987/])
HIVE-948: more query plan optimization rules (Navis via Ashutosh Chauhan) 
(Revision 1449981)

 Result = FAILURE
hashutosh : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1449981
Files : 
* /hive/trunk/contrib/src/test/results/clientpositive/serde_typedbytes.q.out
* /hive/trunk/contrib/src/test/results/clientpositive/serde_typedbytes5.q.out
* /hive/trunk/hbase-handler/src/test/results/positive/hbase_queries.q.out
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/NonBlockingOpDeDupProc.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/Optimizer.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDescUtils.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/ppd/OpProcFactory.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/ppd/PredicateTransitivePropagate.java
* /hive/trunk/ql/src/test/queries/clientpositive/nonblock_op_deduplicate.q
* /hive/trunk/ql/src/test/results/clientnegative/bucket_mapjoin_mismatch1.q.out
* /hive/trunk/ql/src/test/results/clientnegative/sortmerge_mapjoin_mismatch_1.q.out
* /hive/trunk/ql/src/test/results/clientpositive/alias_casted_column.q.out
* /hive/trunk/ql/src/test/results/clientpositive/ambiguous_col.q.out
* /hive/trunk/ql/src/test/results/clientpositive/auto_join1.q.out
* /hive/trunk/ql/src/test/results/clientpositive/auto_join12.q.out
* /hive/trunk/ql/src/test/results/clientpositive/auto_join14_hadoop20.q.out
* /hive/trunk/ql/src/test/results/clientpositive/auto_join17.q.out
* /hive/trunk/ql/src/test/results/clientpositive/auto_join19.q.out
* /hive/trunk/ql/src/test/results/clientpositive/auto_join2.q.out
* /hive/trunk/ql/src/test/results/clientpositive/auto_join20.q.out
* /hive/trunk/ql/src/test/results/clientpositive/auto_join22.q.out
* /hive/trunk/ql/src/test/results/clientpositive/auto_join26.q.out
* /hive/trunk/ql/src/test/results/clientpositive/auto_join28.q.out
* /hive/trunk/ql/src/test/results/clientpositive/auto_join29.q.out
* /hive/trunk/ql/src/test/results/clientpositive/auto_join3.q.out
* /hive/trunk/ql/src/test/results/clientpositive/auto_join4.q.out
* /hive/trunk/ql/src/test/results/clientpositive/auto_join5.q.out
* /hive/trunk/ql/src/test/results/clientpositive/auto_join6.q.out
* /hive/trunk/ql/src/test/results/clientpositive/auto_join7.q.out
* /hive/trunk/ql/src/test/results/clientpositive/auto_join8.q.out
* /hive/trunk/ql/src/test/results/clientpositive/auto_join9.q.out
* /hive/trunk/ql/src/test/results/clientpositive/auto_smb_mapjoin_14.q.out
* /hive/trunk/ql/src/test/results/clientpositive/auto_sortmerge_join_9.q.out
* /hive/trunk/ql/src/test/results/clientpositive/binarysortable_1.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucket_groupby.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucket_map_join_1.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucket_map_join_2.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketcontext_1.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketcontext_2.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketcontext_3.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketcontext_4.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketcontext_5.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketcontext_6.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketcontext_7.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketcontext_8.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketizedhiveinputformat.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketmapjoin1.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketmapjoin10.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketmapjoin11.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketmapjoin12.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketmapjoin13.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketmapjoin2.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketmapjoin3.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketmapjoin4.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketmapjoin5.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketmapjoin7.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketmapjoin8.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketmapjoin9.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketmapjoin_negative.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketmapjoin_negative2.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketma

[jira] [Commented] (HIVE-948) more query plan optimization rules

2013-04-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13623036#comment-13623036
 ] 

Hudson commented on HIVE-948:
-

Integrated in Hive-trunk-hadoop2 #138 (See 
[https://builds.apache.org/job/Hive-trunk-hadoop2/138/])
HIVE-948: more query plan optimization rules (Navis via Ashutosh Chauhan) 
(Revision 1449981)

 Result = FAILURE
hashutosh : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1449981
Files : 
* /hive/trunk/contrib/src/test/results/clientpositive/serde_typedbytes.q.out
* /hive/trunk/contrib/src/test/results/clientpositive/serde_typedbytes5.q.out
* /hive/trunk/hbase-handler/src/test/results/positive/hbase_queries.q.out
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/NonBlockingOpDeDupProc.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/Optimizer.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDescUtils.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/ppd/OpProcFactory.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/ppd/PredicateTransitivePropagate.java
* /hive/trunk/ql/src/test/queries/clientpositive/nonblock_op_deduplicate.q
* /hive/trunk/ql/src/test/results/clientnegative/bucket_mapjoin_mismatch1.q.out
* /hive/trunk/ql/src/test/results/clientnegative/sortmerge_mapjoin_mismatch_1.q.out
* /hive/trunk/ql/src/test/results/clientpositive/alias_casted_column.q.out
* /hive/trunk/ql/src/test/results/clientpositive/ambiguous_col.q.out
* /hive/trunk/ql/src/test/results/clientpositive/auto_join1.q.out
* /hive/trunk/ql/src/test/results/clientpositive/auto_join12.q.out
* /hive/trunk/ql/src/test/results/clientpositive/auto_join14_hadoop20.q.out
* /hive/trunk/ql/src/test/results/clientpositive/auto_join17.q.out
* /hive/trunk/ql/src/test/results/clientpositive/auto_join19.q.out
* /hive/trunk/ql/src/test/results/clientpositive/auto_join2.q.out
* /hive/trunk/ql/src/test/results/clientpositive/auto_join20.q.out
* /hive/trunk/ql/src/test/results/clientpositive/auto_join22.q.out
* /hive/trunk/ql/src/test/results/clientpositive/auto_join26.q.out
* /hive/trunk/ql/src/test/results/clientpositive/auto_join28.q.out
* /hive/trunk/ql/src/test/results/clientpositive/auto_join29.q.out
* /hive/trunk/ql/src/test/results/clientpositive/auto_join3.q.out
* /hive/trunk/ql/src/test/results/clientpositive/auto_join4.q.out
* /hive/trunk/ql/src/test/results/clientpositive/auto_join5.q.out
* /hive/trunk/ql/src/test/results/clientpositive/auto_join6.q.out
* /hive/trunk/ql/src/test/results/clientpositive/auto_join7.q.out
* /hive/trunk/ql/src/test/results/clientpositive/auto_join8.q.out
* /hive/trunk/ql/src/test/results/clientpositive/auto_join9.q.out
* /hive/trunk/ql/src/test/results/clientpositive/auto_smb_mapjoin_14.q.out
* /hive/trunk/ql/src/test/results/clientpositive/auto_sortmerge_join_9.q.out
* /hive/trunk/ql/src/test/results/clientpositive/binarysortable_1.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucket_groupby.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucket_map_join_1.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucket_map_join_2.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketcontext_1.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketcontext_2.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketcontext_3.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketcontext_4.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketcontext_5.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketcontext_6.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketcontext_7.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketcontext_8.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketizedhiveinputformat.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketmapjoin1.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketmapjoin10.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketmapjoin11.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketmapjoin12.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketmapjoin13.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketmapjoin2.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketmapjoin3.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketmapjoin4.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketmapjoin5.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketmapjoin7.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketmapjoin8.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketmapjoin9.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketmapjoin_negative.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketmapjoin_negative2.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucket