[ 
https://issues.apache.org/jira/browse/SPARK-20411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15992517#comment-15992517
 ] 

Jason Moore commented on SPARK-20411:
-------------------------------------

Ideally this would also cover everything else in 
org.apache.spark.sql.functions (e.g. countDistinct).  We're looking to replace 
our use of DataFrames with Datasets, which means finding a typed replacement 
for every aggregation function we use.  If I end up putting together some 
functions myself, I'll pop back here to contribute them.

> New features for expression.scalalang.typed
> -------------------------------------------
>
>                 Key: SPARK-20411
>                 URL: https://issues.apache.org/jira/browse/SPARK-20411
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.0.0, 2.0.1, 2.1.0
>            Reporter: Loic Descotte
>            Priority: Minor
>
> In Spark 2 it is possible to use typed expressions for aggregation methods: 
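> {code}
> import org.apache.spark.sql.expressions.scalalang._
> // Assumes a SparkSession named spark; provides the encoders and the 'productId syntax.
> import spark.implicits._
> dataset.groupByKey(_.productId)
>   .agg(typed.sum[Token](_.score))
>   .toDF("productId", "sum")
>   .orderBy('productId).show
> {code}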
> It seems that only avg, count and sum (plus sumLong) are defined:
> https://spark.apache.org/docs/2.1.0/api/java/org/apache/spark/sql/expressions/scalalang/typed.html
> It is very nice to be able to use a type-safe DSL, but it would be good to 
> have more options, such as min and max functions.
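
Until something like min and max exists in typed, one possible workaround is a custom org.apache.spark.sql.expressions.Aggregator. The following is only a minimal sketch under stated assumptions, not an API that Spark provides: the names TypedFoldDouble, typedExtra and TypedMinMaxSketch are hypothetical, and so are the field types of Token, since the issue only shows that it has productId and score. It assumes Spark 2.x on the classpath and Double-valued fields.

```scala
import org.apache.spark.sql.{Encoder, Encoders, SparkSession, TypedColumn}
import org.apache.spark.sql.expressions.Aggregator

// Hypothetical record type; the field types are an assumption.
case class Token(productId: Int, score: Double)

// Hypothetical helper, not part of Spark: folds the Double extracted by `f`
// with the binary operation `op`, starting from the neutral element `init`.
class TypedFoldDouble[IN](f: IN => Double, init: Double, op: (Double, Double) => Double)
    extends Aggregator[IN, Double, Double] {
  override def zero: Double = init
  override def reduce(b: Double, a: IN): Double = op(b, f(a))
  override def merge(b1: Double, b2: Double): Double = op(b1, b2)
  override def finish(reduction: Double): Double = reduction
  override def bufferEncoder: Encoder[Double] = Encoders.scalaDouble
  override def outputEncoder: Encoder[Double] = Encoders.scalaDouble
}

// Hypothetical counterpart to `typed`, exposing max and min as typed columns.
object typedExtra {
  def max[IN](f: IN => Double): TypedColumn[IN, Double] =
    new TypedFoldDouble[IN](f, Double.NegativeInfinity, (x, y) => math.max(x, y)).toColumn

  def min[IN](f: IN => Double): TypedColumn[IN, Double] =
    new TypedFoldDouble[IN](f, Double.PositiveInfinity, (x, y) => math.min(x, y)).toColumn
}

object TypedMinMaxSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("typed-min-max-sketch")
      .getOrCreate()
    import spark.implicits._

    // Small in-memory dataset standing in for the one in the issue.
    val dataset = Seq(Token(1, 0.5), Token(1, 2.0), Token(2, 1.0)).toDS()

    dataset.groupByKey(_.productId)
      .agg(typedExtra.max[Token](_.score), typedExtra.min[Token](_.score))
      .toDF("productId", "max", "min")
      .orderBy("productId")
      .show()

    spark.stop()
  }
}
```

The infinities serve as neutral elements, which is safe here because every group produced by groupByKey contains at least one row. A version that had to handle empty input would need a buffer that also records whether any value was seen.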



