[ https://issues.apache.org/jira/browse/SPARK-20411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15992517#comment-15992517 ]
Jason Moore commented on SPARK-20411:
-------------------------------------

And, ideally, anything else within org.apache.spark.sql.functions (e.g. countDistinct). We're looking to replace our use of DataFrames with Datasets, which means finding a replacement for all the aggregation functions that we use. If I end up putting together some functions myself, I'll pop back here to contribute them.

> New features for expression.scalalang.typed
> -------------------------------------------
>
>                 Key: SPARK-20411
>                 URL: https://issues.apache.org/jira/browse/SPARK-20411
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.0.0, 2.0.1, 2.1.0
>            Reporter: Loic Descotte
>            Priority: Minor
>
> In Spark 2 it is possible to use typed expressions for aggregation methods:
> {code}
> import org.apache.spark.sql.expressions.scalalang._
> dataset.groupByKey(_.productId).agg(typed.sum[Token](_.score)).toDF("productId", "sum").orderBy('productId).show
> {code}
> It seems that only avg, count and sum are defined:
> https://spark.apache.org/docs/2.1.0/api/java/org/apache/spark/sql/expressions/scalalang/typed.html
> It is very nice to be able to use a typesafe DSL, but it would be good to
> have more possibilities, like min and max functions.
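
For illustration, a typed min of the kind requested in the issue can be sketched with the public org.apache.spark.sql.expressions.Aggregator API. This is only a sketch under stated assumptions, not part of Spark and not a proposed patch: the class name TypedMinDouble is hypothetical, the fields of Token are guessed from the snippet in the issue description, and the local SparkSession is there only to make the example self-contained.

```scala
import org.apache.spark.sql.{Encoder, Encoders, SparkSession}
import org.apache.spark.sql.expressions.Aggregator

// Hypothetical record type; the fields are assumed from the issue's example.
case class Token(productId: Int, score: Double)

// Hypothetical typed min over a Double field (not provided by Spark's `typed` object).
class TypedMinDouble[IN](f: IN => Double) extends Aggregator[IN, Double, Double] {
  // Identity element for min; it is only returned if the aggregator sees no rows.
  override def zero: Double = Double.PositiveInfinity
  // Fold one input row into the running minimum.
  override def reduce(b: Double, a: IN): Double = math.min(b, f(a))
  // Combine partial minima computed on different partitions.
  override def merge(b1: Double, b2: Double): Double = math.min(b1, b2)
  override def finish(reduction: Double): Double = reduction
  override def bufferEncoder: Encoder[Double] = Encoders.scalaDouble
  override def outputEncoder: Encoder[Double] = Encoders.scalaDouble
}

object TypedMinExample {
  def main(args: Array[String]): Unit = {
    // Local session for demonstration purposes only.
    val spark = SparkSession.builder().master("local[*]").appName("typed-min").getOrCreate()
    import spark.implicits._

    val dataset = Seq(Token(1, 0.5), Token(1, 0.2), Token(2, 0.9)).toDS()

    dataset
      .groupByKey(_.productId)
      .agg(new TypedMinDouble[Token](_.score).toColumn.name("min"))
      .toDF("productId", "min")
      .orderBy("productId")
      .show()

    spark.stop()
  }
}
```

A max variant would follow the same shape, with Double.NegativeInfinity as the zero value and math.max in reduce and merge. Returning an infinity for empty input is a simplification; an Option[Double] output type would be one way to avoid it.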