In the following example, after I used "typed.avg" to generate a TypedColumn, and I want to apply round on top of it? But why Spark complains about it? Because it doesn't know that it is a TypedColumn<Token, Double>?
Thanks Yong scala> spark.version res20: String = 2.1.0 scala> case class Token(name: String, productId: Int, score: Double) defined class Token scala> val data = Token("aaa", 100, 0.12) :: Token("aaa", 200, 0.29) :: Token("bbb", 200, 0.53) :: Token("bbb", 300, 0.42) :: Nil data: List[Token] = List(Token(aaa,100,0.12), Token(aaa,200,0.29), Token(bbb,200,0.53), Token(bbb,300,0.42)) scala> val dataset = data.toDS dataset: org.apache.spark.sql.Dataset[Token] = [name: string, productId: int ... 1 more field] scala> import org.apache.spark.sql.expressions.scalalang._ import org.apache.spark.sql.expressions.scalalang._ scala> dataset.groupByKey(_.productId).agg(typed.avg[Token](_.score)).show +-----+-----------------------------------------+ |value|TypedAverage($line22.$readiwiw$Token)| +-----+-----------------------------------------+ | 300| 0.42| | 100| 0.12| | 200| 0.41000000000000003| +-----+-----------------------------------------+ scala> dataset.groupByKey(_.productId).agg(round(typed.avg[Token](_.score))) <console>:36: error: type mismatch; found : org.apache.spark.sql.Column required: org.apache.spark.sql.TypedColumn[Token,?] dataset.groupByKey(_.productId).agg(round(typed.avg[Token](_.score)))