I noticed that my Spark jobs suddenly return empty data and tried to find out why. A groupBy operation seems to be the cause. When I run
    val original: RDD[Data]
    val x = original.cache().groupBy(x => (x.first, x.last, x.date))

and then try

    println(s"${x.first()}")

I get:

    Exception in thread "main" java.lang.UnsupportedOperationException: empty collection

original definitely is not empty. I am using Spark 1.2.1 on Mesos. Any ideas?

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/GroupBy-on-RDD-returns-empty-collection-tp23105.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
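For comparison, the same groupBy pattern on a plain local Scala collection (no Spark involved) returns a non-empty map whenever the input is non-empty, which is what I would expect from the RDD version as well. This is only a sketch: `Data` here is a hypothetical stand-in for my actual class, with the three fields used in the grouping key.

```scala
// Hypothetical stand-in for the Data class used in the Spark job.
case class Data(first: String, last: String, date: String)

object GroupByCheck {
  // Small sample input; the real RDD is built elsewhere.
  val original: Seq[Data] = Seq(
    Data("Ada", "Lovelace", "2015-05-01"),
    Data("Ada", "Lovelace", "2015-05-01"),
    Data("Alan", "Turing", "2015-05-02")
  )

  // Same composite key as in the Spark job: (first, last, date).
  val grouped: Map[(String, String, String), Seq[Data]] =
    original.groupBy(d => (d.first, d.last, d.date))

  def main(args: Array[String]): Unit = {
    // A non-empty input always yields a non-empty grouping here.
    println(s"groups: ${grouped.size}")
    grouped.keys.foreach(println)
  }
}
```

Locally this gives two groups (the two Ada rows share one key), so the empty-collection exception from `x.first()` on the cached RDD looks Spark-specific rather than a property of groupBy itself.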