[GitHub] spark pull request #21072: [SPARK-23973][SQL] Remove consecutive Sorts

henryr Sat, 14 Apr 2018 14:19:36 -0700

Github user henryr commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21072#discussion_r181563918
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -736,12 +736,15 @@ object EliminateSorts extends Rule[LogicalPlan] {
     }
     
     /**
    - * Removes Sort operation if the child is already sorted
    + * Removes redundant Sort operation. This can happen:
    + * 1) if the child is already sorted
    + * 2) if the next operator is a Sort itself
      */
     object RemoveRedundantSorts extends Rule[LogicalPlan] {
       def apply(plan: LogicalPlan): LogicalPlan = plan transform {
         case Sort(orders, true, child) if SortOrder.orderingSatisfies(child.outputOrdering, orders) =>
           child
    +    case s @ Sort(_, _, Sort(_, _, child)) => s.copy(child = child)
    --- End diff --
    
    Thanks for doing this! It might be useful to generalise this to any pair of
    sorts separated by zero or more projections or filters. I did this for my
    SPARK-23975 PR, see:
    https://github.com/henryr/spark/commit/bb992c2058863322a9183b2985806a87729e4168#diff-a636a87d8843eeccca90140be91d4fafR322
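
    To make the suggestion concrete, here is a minimal sketch of the idea on a
    toy plan model. It is not Spark's LogicalPlan and not the code from the
    linked commit: Plan, Relation, Project, Filter, this Sort and
    RemoveShadowedSorts are all hypothetical names for illustration. It also
    assumes the projections and filters in between are deterministic, so that
    dropping the inner sort cannot change which rows they produce.

```scala
// Toy plan model for illustration only; these are NOT Spark's catalyst classes.
sealed trait Plan
case class Relation(name: String) extends Plan
case class Project(columns: Seq[String], child: Plan) extends Plan
case class Filter(condition: String, child: Plan) extends Plan
case class Sort(order: Seq[String], global: Boolean, child: Plan) extends Plan

object RemoveShadowedSorts {
  // Walk down through Project/Filter nodes and drop every Sort met on the way.
  // Stops at the first node of any other kind.
  private def stripSorts(plan: Plan): Plan = plan match {
    case Sort(_, _, child)     => stripSorts(child)
    case p @ Project(_, child) => p.copy(child = stripSorts(child))
    case f @ Filter(_, child)  => f.copy(child = stripSorts(child))
    case other                 => other
  }

  // An outer Sort re-orders its input anyway, so any Sort beneath it that is
  // separated only by projections or filters (assumed deterministic) is dropped.
  def apply(plan: Plan): Plan = plan match {
    case s @ Sort(_, _, child) => s.copy(child = stripSorts(child))
    case p @ Project(_, child) => p.copy(child = apply(child))
    case f @ Filter(_, child)  => f.copy(child = apply(child))
    case other                 => other
  }
}
```

    With this, Sort(a, Project(Filter(Sort(b, t)))) becomes
    Sort(a, Project(Filter(t))), and the directly nested Sort(a, Sort(b, t))
    case from the diff is covered as the zero-operators-in-between case.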
    
    What do you think?


