Count distinct with groupBy usage

2014-07-15 Thread buntu
.1001560.n3.nabble.com/Count-distinct-with-groupBy-usage-tp9781.html Sent from the Apache Spark User List mailing list archive at Nabble.com.

Re: Count distinct with groupBy usage

2014-07-15 Thread Nick Pentreath
count distinct on userId and also apply another groupBy on timestamp field. Please let me know how to handle such cases. Thanks! -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Count-distinct-with-groupBy-usage-tp9781.html

Re: Count distinct with groupBy usage

2014-07-15 Thread Zongheng Yang
such cases. Thanks!

Re: Count distinct with groupBy usage

2014-07-15 Thread buntu
-spark-user-list.1001560.n3.nabble.com/Count-distinct-with-groupBy-usage-tp9781p9787.html

Re: Count distinct with groupBy usage

2014-07-15 Thread buntu
Thanks Nick. All I'm attempting is to report the number of unique visitors per page by date. -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Count-distinct-with-groupBy-usage-tp9781p9786.html

Re: Count distinct with groupBy usage

2014-07-15 Thread Sean Owen
csv.groupBy(_(1)).count But not able to see how to do count distinct on userId and also apply another groupBy on timestamp field. Please let me know how to handle such cases. Thanks!

Re: Count distinct with groupBy usage

2014-07-15 Thread buntu
Thanks Sean!! That's what I was looking for -- group by on multiple fields. I'm gonna play with it now. Thanks again! -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Count-distinct-with-groupBy-usage-tp9781p9803.html
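
For anyone landing on this thread later: the replies are cut off in this archive view, so the following is only a rough sketch of the "group by on multiple fields" idea, not the code that was actually posted. It uses plain Scala collections so it runs without a cluster. The column layout (0 = date, 1 = page, 2 = userId) and the names `UniqueVisitors` and `uniqueVisitors` are assumptions made up for illustration.

```scala
object UniqueVisitors {
  // Assumed (hypothetical) column layout: 0 = date, 1 = page, 2 = userId.
  def uniqueVisitors(rows: Seq[Array[String]]): Map[(String, String), Int] =
    rows
      .map(r => ((r(0), r(1)), r(2)))   // composite key (date, page) -> userId
      .distinct                          // drop repeat visits by the same user
      .groupBy { case (key, _) => key }  // group on both fields at once
      .map { case (key, visits) => key -> visits.size }

  def main(args: Array[String]): Unit = {
    // Made-up sample rows, not data from the thread.
    val rows = Seq(
      Array("2014-07-15", "/home", "u1"),
      Array("2014-07-15", "/home", "u1"),
      Array("2014-07-15", "/home", "u2"),
      Array("2014-07-16", "/home", "u1")
    )
    uniqueVisitors(rows).foreach { case ((date, page), n) =>
      println(s"$date $page $n")
    }
  }
}
```

On a Spark RDD the same shape should carry over: map each row to a `((date, page), userId)` pair, call `distinct()`, then `countByKey()`. Rows are converted to tuples before `distinct` because arrays compare by reference, not by content.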