Amit Sela created SPARK-16474:
---------------------------------

             Summary: Global Aggregation doesn't seem to work at all 
                 Key: SPARK-16474
                 URL: https://issues.apache.org/jira/browse/SPARK-16474
             Project: Spark
          Issue Type: Sub-task
          Components: SQL
    Affects Versions: 1.6.2, 2.0.0
            Reporter: Amit Sela


Executing a global aggregation (one that is not grouped by key) over a Dataset fails.
Take the following code for example:
{code}
    // Imports required by the snippet below.
    import org.apache.spark.sql.{Encoder, SparkSession}
    import org.apache.spark.sql.expressions.Aggregator

    val session = SparkSession.builder()
                              .appName("TestGlobalAggregator")
                              .master("local[*]")
                              .getOrCreate()
    import session.implicits._

    val ds1 = List(1, 2, 3).toDS
    // Global (non-grouped) aggregation: sum all elements of the Dataset.
    val ds2 = ds1.agg(
      new Aggregator[Int, Int, Int] {
        def zero: Int = 0
        def reduce(b: Int, a: Int): Int = b + a
        def merge(b1: Int, b2: Int): Int = b1 + b2
        def finish(reduction: Int): Int = reduction
        def bufferEncoder: Encoder[Int] = implicitly[Encoder[Int]]
        def outputEncoder: Encoder[Int] = implicitly[Encoder[Int]]
      }.toColumn)

    ds2.printSchema
    ds2.show
{code}
I would expect the result to be 6, but instead I get the following exception:
{noformat}
java.lang.ClassCastException: 
org.apache.spark.sql.catalyst.expressions.GenericRowWithSchema cannot be cast 
to java.lang.Integer 
at scala.runtime.BoxesRunTime.unboxToInt(BoxesRunTime.java:101)
.........
{noformat}
Trying the same code on DataFrames in 1.6.2 results in:
{noformat}
Exception in thread "main" org.apache.spark.sql.AnalysisException: unresolved 
operator 'Aggregate [(anon$1(),mode=Complete,isDistinct=false) AS anon$1()#8]; 
at 
org.apache.spark.sql.catalyst.analysis.CheckAnalysis$class.failAnalysis(CheckAnalysis.scala:38)
..........
{noformat}
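For comparison, a minimal sketch (not from the reproducer above) of the same global sum computed through other code paths on the same Dataset, which I would expect to return 6; the explicit column name "value" in the DataFrame variant is an assumption made for illustration:
{code}
    // Typed reduction over the Dataset: should yield 6.
    val total: Int = ds1.reduce(_ + _)

    // Untyped DataFrame aggregation over an explicitly named column.
    import org.apache.spark.sql.functions.sum
    ds1.toDF("value").agg(sum("value")).show()
{code}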


