I noticed that my Spark jobs suddenly return empty data and tried to find out
why. A groupBy operation seems to be the cause. When I run
val original: RDD[Data]
val x = original.cache().groupBy(x => (x.first, x.last, x.date))
and then try
println(s"${x.first()}")
I get an
Exception in thread "main" java.lang.UnsupportedOperationException: empty collection
original is definitely not empty.
I use Spark 1.2.1 on Mesos. Any ideas?
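For what it's worth, RDD.groupBy has the same contract as groupBy on plain Scala collections: grouping a non-empty collection can never yield an empty result, so first() should only throw "empty collection" if the RDD was actually empty when the action ran. A quick local sanity check of the keying function (Data and its first/last/date fields are stand-ins here, since my real case class isn't shown above) looks like this:

```scala
// Hypothetical stand-in for the real Data class from the job above.
case class Data(first: String, last: String, date: String)

object GroupBySanityCheck {
  def main(args: Array[String]): Unit = {
    val original = Seq(
      Data("John", "Doe", "2015-01-01"),
      Data("John", "Doe", "2015-01-01"), // duplicate key, collapses into one group
      Data("Jane", "Roe", "2015-02-01")
    )

    // Same keying function as in the Spark job: group by (first, last, date).
    val grouped = original.groupBy(d => (d.first, d.last, d.date))

    // A non-empty input always produces a non-empty grouping.
    assert(grouped.nonEmpty)
    println(grouped.size) // two distinct keys here
  }
}
```

This works as expected locally, which makes me suspect the RDD itself is empty by the time the action runs (e.g. the lazy pipeline that builds original), rather than groupBy dropping data.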
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/GroupBy-on-RDD-returns-empty-collection-tp23105.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.