[ 
https://issues.apache.org/jira/browse/HIVE-2340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13566898#comment-13566898
 ] 

Phabricator commented on HIVE-2340:
-----------------------------------

hagleitn has commented on the revision "HIVE-2340 [jira] optimize orderby 
followed by a groupby".

  Partial review

INLINE COMMENTS
  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java:521 Not sure why
this is needed or why it defaults to 4. From the comment below, it seems this
is just to avoid the single-reducer order-by case for performance reasons. Is
that correct?
  
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ReduceSinkDeDuplication.java:787
Is this required, or is it extra protection? The comment at the top of the file
says the mapjoin optimization happens before this (and it probably should, for
performance reasons). Also, if I understand it correctly, "joinAndSort" might be
a better name than "fixed". You're basically saying that if an optimization
wants to change the join after this, it needs to make sure the ordering of the
keys is preserved, right?
  
  ql/src/java/org/apache/hadoop/hive/ql/ppd/PredicateTransitivePropagate.java:136
This seems orthogonal to this patch.
  ql/src/test/queries/clientpositive/reduce_deduplicate.q:7 There are not a lot
of tests for min.reducer=1; there is no order-by case, for instance. Maybe
reduce_deduplicate_extended.q should run with both the default and
min.reducer=1.

REVISION DETAIL
  https://reviews.facebook.net/D1209

To: JIRA, navis
Cc: hagleitn

                
> optimize orderby followed by a groupby
> --------------------------------------
>
>                 Key: HIVE-2340
>                 URL: https://issues.apache.org/jira/browse/HIVE-2340
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Query Processor
>            Reporter: Navis
>            Assignee: Navis
>            Priority: Minor
>              Labels: perfomance
>         Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.1.patch, 
> ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.2.patch, 
> ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.3.patch, 
> ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.4.patch, 
> ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.5.patch, HIVE-2340.1.patch.txt, 
> HIVE-2340.D1209.6.patch, HIVE-2340.D1209.7.patch, HIVE-2340.D1209.8.patch, 
> HIVE-2340.D1209.9.patch, testclidriver.txt
>
>
> Before implementing an optimizer for JOIN-GBY, try to implement an RS-GBY
> optimizer (cluster-by following group-by).

