Sent from the Apache Spark Developers List mailing list archive at Nabble.com.
Hi Dirceu,
Does the issue not show up if you run "map(f =>
f(1).asInstanceOf[Int]).sum" on the "train" RDD? It appears that f(1) is
a String, not an Int. If you're looking to parse and convert it, "toInt"
should be used instead of "asInstanceOf".
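To illustrate the point, here is a minimal sketch (the `row` value is a hypothetical stand-in for a SchemaRDD Row): `asInstanceOf` only reinterprets the reference, so casting a boxed String to Int fails at runtime, while `toString.toInt` actually parses the text.

```scala
// Hypothetical row data: column 1 holds the number as a String.
val row: Seq[Any] = Seq("id-1", "42")

// asInstanceOf does not convert; unboxing a String as an Int throws
// a ClassCastException at runtime:
val castAttempt = scala.util.Try(row(1).asInstanceOf[Int])
println(castAttempt.isFailure) // true

// toInt parses the String into an actual Int:
val parsed = row(1).toString.toInt
println(parsed) // 42
```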
-Sandy
On Wed, Jan 21, 2015 at 8:43 AM, Dirceu
Hi Sandy, thanks for the reply.
I tried to run this code without the cache and it worked.
Also, if I cache before repartitioning, it works; the problem seems to be
related to the combination of repartition and caching.
My train is a SchemaRDD, and if I make all my columns StringType, the
error doesn't show up.
Hi guys, has anyone run into something like this?
I have a training set, and when I repartition it and then call cache, it
throws a ClassCastException when I try to execute anything that accesses it:
val rep120 = train.repartition(120)
val cached120 = rep120.cache
cached120.map(f => f(1).asInstanceOf[Int]).sum
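Putting the two observations from this thread together, a hedged sketch of the fixes (assuming a Spark 1.x SchemaRDD named `train`, with column 1 holding numeric strings; this is an illustration, not the poster's actual code):

```scala
// Workaround reported in the thread: cache *before* repartitioning.
val cached = train.cache()
val rep120 = cached.repartition(120)

// Sandy's suggestion: parse the String field with toInt instead of
// casting it with asInstanceOf, which cannot convert String to Int.
val total = rep120.map(f => f(1).toString.toInt).sum()
```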