[ https://issues.apache.org/jira/browse/SPARK-10746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16790428#comment-16790428 ]
Izek Greenfield edited comment on SPARK-10746 at 3/12/19 10:41 AM:
-------------------------------------------------------------------

You can implement that by using:
{code:scala}
import org.apache.spark.sql.functions._

size(collect_set(column).over(window))
{code}


was (Author: igreenfi):
You can implement that by using:
{code:java}
// Some comments here
size(collect_set(column).over(window))
{code}

> count ( distinct columnref) over () returns wrong result set
> ------------------------------------------------------------
>
>                 Key: SPARK-10746
>                 URL: https://issues.apache.org/jira/browse/SPARK-10746
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.5.0
>            Reporter: N Campbell
>            Priority: Major
>
> Same issue as reported against Hive (HIVE-9534).
> The result set was expected to contain 5 rows instead of 1 row, as other vendors
> (Oracle, Netezza, etc.) would return.
> select count( distinct column) over () from t1
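
For reference, a minimal, self-contained sketch of the workaround from the comment above, assuming Spark 2.x or later (SparkSession) running in local mode. The table contents, the column name c1, and the names wholeTable and distinct_c1 are made up for illustration; only size, collect_set and over come from the comment. An empty partition spec is used here to stand in for OVER ().

{code:scala}
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{col, collect_set, size}

object CountDistinctOverWindow {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("count-distinct-over-window")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical stand-in for t1: five rows, three distinct values.
    val t1 = Seq(1, 1, 2, 3, 3).toDF("c1")

    // Empty partition spec: every row sees the whole table, like OVER ().
    val wholeTable = Window.partitionBy()

    // collect_set gathers the distinct non-null values in the window;
    // size counts them, emulating count(distinct c1) over ().
    val result = t1.withColumn(
      "distinct_c1",
      size(collect_set(col("c1")).over(wholeTable))
    )

    result.show()
    spark.stop()
  }
}
{code}

With this input the result keeps all five rows, each carrying the same distinct count, which is the shape the reporter expected. Note that collect_set materializes the distinct values for each window, so this may be costly on high-cardinality columns.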