Amit Sela created SPARK-16474:
---------------------------------

             Summary: Global Aggregation doesn't seem to work at all
                 Key: SPARK-16474
                 URL: https://issues.apache.org/jira/browse/SPARK-16474
             Project: Spark
          Issue Type: Sub-task
          Components: SQL
    Affects Versions: 1.6.2, 2.0.0
            Reporter: Amit Sela

Executing a global aggregation (one that is not grouped by key) fails. Take the following code for example:

{code}
import org.apache.spark.sql.{Encoder, SparkSession}
import org.apache.spark.sql.expressions.Aggregator

val session = SparkSession.builder()
  .appName("TestGlobalAggregator")
  .master("local[*]")
  .getOrCreate()
import session.implicits._

val ds1 = List(1, 2, 3).toDS

// Sum all elements of the Dataset with a custom Aggregator, without grouping.
val ds2 = ds1.agg(
  new Aggregator[Int, Int, Int] {
    def zero: Int = 0
    def reduce(b: Int, a: Int): Int = b + a
    def merge(b1: Int, b2: Int): Int = b1 + b2
    def finish(reduction: Int): Int = reduction
    def bufferEncoder: Encoder[Int] = implicitly[Encoder[Int]]
    def outputEncoder: Encoder[Int] = implicitly[Encoder[Int]]
  }.toColumn)

ds2.printSchema
ds2.show
{code}

I would expect the result to be 6, but instead I get the following exception:

{noformat}
java.lang.ClassCastException: org.apache.spark.sql.catalyst.expressions.GenericRowWithSchema cannot be cast to java.lang.Integer
	at scala.runtime.BoxesRunTime.unboxToInt(BoxesRunTime.java:101)
	.........
{noformat}

Trying the same code on DataFrames in 1.6.2 results in:

{noformat}
Exception in thread "main" org.apache.spark.sql.AnalysisException: unresolved operator 'Aggregate [(anon$1(),mode=Complete,isDistinct=false) AS anon$1()#8];
	at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$class.failAnalysis(CheckAnalysis.scala:38)
	..........
{noformat}
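
To make the expected value of 6 concrete, here is a minimal sketch in plain Scala, without Spark, of what the four Aggregator methods in the report should compute over List(1, 2, 3). The trait SimpleAggregator, the object ExpectedGlobalAggregation and the split into two partitions are hypothetical stand-ins chosen for illustration. They are not Spark APIs and say nothing about how Spark actually schedules the aggregation.

```scala
// Hypothetical stand-in for the zero/reduce/merge/finish contract used in the
// report. This is NOT Spark's Aggregator class and has no Spark dependency.
trait SimpleAggregator[IN, BUF, OUT] {
  def zero: BUF
  def reduce(b: BUF, a: IN): BUF
  def merge(b1: BUF, b2: BUF): BUF
  def finish(reduction: BUF): OUT
}

object ExpectedGlobalAggregation {
  // Same logic as the anonymous Aggregator[Int, Int, Int] in the report.
  val sum: SimpleAggregator[Int, Int, Int] = new SimpleAggregator[Int, Int, Int] {
    def zero: Int = 0
    def reduce(b: Int, a: Int): Int = b + a
    def merge(b1: Int, b2: Int): Int = b1 + b2
    def finish(reduction: Int): Int = reduction
  }

  // Assumed evaluation order: reduce each partition starting from zero,
  // merge the partial buffers, then apply finish once to the merged buffer.
  def aggregate[IN, BUF, OUT](
      partitions: Seq[Seq[IN]],
      agg: SimpleAggregator[IN, BUF, OUT]): OUT = {
    val partials = partitions.map(_.foldLeft(agg.zero)(agg.reduce))
    agg.finish(partials.foldLeft(agg.zero)(agg.merge))
  }

  def main(args: Array[String]): Unit = {
    // List(1, 2, 3) split arbitrarily into two partitions.
    val result = aggregate(Seq(Seq(1, 2), Seq(3)), sum)
    println(result) // prints 6
  }
}
```

Because addition is associative and zero is its identity, the result does not depend on how the elements are partitioned, so a global aggregation over the whole Dataset should yield a single value of 6.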