spark git commit: [SPARK-6303][SQL] Remove unnecessary Average in GeneratedAggregate

marmbrus Mon, 13 Apr 2015 18:15:55 -0700

Repository: spark
Updated Branches:
  refs/heads/master d7f2c1986 -> 5b8b324f3



[SPARK-6303][SQL] Remove unnecessary Average in GeneratedAggregate

Because `Average` is a `PartialAggregate`, we never get an `Average` node when
reaching `HashAggregation` to prepare `GeneratedAggregate`.

That is why SQLQuerySuite already has a test for `avg` with codegen, and it
passes.

However, `GeneratedAggregate` still contains a case that handles `Average`.
Given the above, that case is never actually executed.

So we can remove this case from `GeneratedAggregate`.
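
To illustrate why the `Average` case is unreachable, here is a minimal sketch
in plain Scala with no Spark dependency. It assumes the partial form of an
average is a running sum plus a count of non-null inputs, combined by a final
division; `PartialAverageSketch`, `SumCount`, `partial` and `finalAverage` are
hypothetical names for illustration, not Spark APIs.

```scala
// Sketch only: models the partial/final split of an average outside Spark.
// All names here are hypothetical and chosen for illustration.
object PartialAverageSketch {
  // Partial state carried in place of an Average node:
  // a running sum and a count of non-null inputs.
  final case class SumCount(sum: Double, count: Long)

  // Partial step: fold one partition into a (sum, count) pair.
  // Nulls are modelled as None and are skipped.
  def partial(partition: Seq[Option[Double]]): SumCount =
    partition.flatten.foldLeft(SumCount(0.0, 0L)) { (acc, x) =>
      SumCount(acc.sum + x, acc.count + 1)
    }

  // Final step: merge the partial states, then divide.
  // A zero count yields None, standing in for a null result.
  def finalAverage(partials: Seq[SumCount]): Option[Double] = {
    val merged = partials.foldLeft(SumCount(0.0, 0L)) { (a, b) =>
      SumCount(a.sum + b.sum, a.count + b.count)
    }
    if (merged.count == 0L) None else Some(merged.sum / merged.count)
  }

  def main(args: Array[String]): Unit = {
    val partitions: Seq[Seq[Option[Double]]] =
      Seq(Seq(Some(1.0), None, Some(3.0)), Seq(Some(5.0)))
    println(finalAverage(partitions.map(partial))) // prints Some(3.0)
  }
}
```

In this model the aggregation operator only ever handles sums and counts, so a
dedicated `Average` branch in it would have nothing to match.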

Author: Liang-Chi Hsieh <vii...@gmail.com>

Closes #4996 from viirya/add_average_codegened and squashes the following 
commits:

621c12f [Liang-Chi Hsieh] Merge remote-tracking branch 'upstream/master' into add_average_codegened
368cfbc [Liang-Chi Hsieh] Merge remote-tracking branch 'upstream/master' into add_average_codegened
74926d1 [Liang-Chi Hsieh] Add Average in canBeCodeGened lists.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/5b8b324f
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/5b8b324f
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/5b8b324f

Branch: refs/heads/master
Commit: 5b8b324f33e857b95de65031334846a7ca26fa60
Parents: d7f2c19
Author: Liang-Chi Hsieh <vii...@gmail.com>
Authored: Mon Apr 13 18:15:29 2015 -0700
Committer: Michael Armbrust <mich...@databricks.com>
Committed: Mon Apr 13 18:15:29 2015 -0700

----------------------------------------------------------------------
 .../sql/execution/GeneratedAggregate.scala      | 45 --------------------
 1 file changed, 45 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/5b8b324f/sql/core/src/main/scala/org/apache/spark/sql/execution/GeneratedAggregate.scala
----------------------------------------------------------------------
diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/GeneratedAggregate.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/GeneratedAggregate.scala
index 95176e4..b510cf0 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/GeneratedAggregate.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/GeneratedAggregate.scala
@@ -153,51 +153,6 @@ case class GeneratedAggregate(
 
         AggregateEvaluation(currentSum :: Nil, initialValue :: Nil, updateFunction :: Nil, result)
         
-      case a @ Average(expr) =>
-        val calcType =
-          expr.dataType match {
-            case DecimalType.Fixed(_, _) =>
-              DecimalType.Unlimited
-            case _ =>
-              expr.dataType
-          }
-
-        val currentCount = AttributeReference("currentCount", LongType, nullable = false)()
-        val currentSum = AttributeReference("currentSum", calcType, nullable = false)()
-        val initialCount = Literal(0L)
-        val initialSum = Cast(Literal(0L), calcType)
-
-        // If we're evaluating UnscaledValue(x), we can do Count on x directly, since its
-        // UnscaledValue will be null if and only if x is null; helps with Average on decimals
-        val toCount = expr match {
-          case UnscaledValue(e) => e
-          case _ => expr
-        }
-
-        val updateCount = If(IsNotNull(toCount), Add(currentCount, Literal(1L)), currentCount)
-        val updateSum = Coalesce(Add(Cast(expr, calcType), currentSum) :: currentSum :: Nil)
-
-        val result =
-          expr.dataType match {
-            case DecimalType.Fixed(_, _) =>
-              If(EqualTo(currentCount, Literal(0L)),
-                Literal.create(null, a.dataType),
-                Cast(Divide(
-                  Cast(currentSum, DecimalType.Unlimited),
-                  Cast(currentCount, DecimalType.Unlimited)), a.dataType))
-            case _ =>
-              If(EqualTo(currentCount, Literal(0L)),
-                Literal.create(null, a.dataType),
-                Divide(Cast(currentSum, a.dataType), Cast(currentCount, a.dataType)))
-          }
-
-        AggregateEvaluation(
-          currentCount :: currentSum :: Nil,
-          initialCount :: initialSum :: Nil,
-          updateCount :: updateSum :: Nil,
-          result
-        )
-
       case m @ Max(expr) =>
         val currentMax = AttributeReference("currentMax", expr.dataType, nullable = true)()
         val initialValue = Literal.create(null, expr.dataType)


