Hi, I am working with a small dataset (about 13 MB) in the spark-shell. After doing a groupBy on the RDD, I wanted to cache the RDD in memory, but I keep getting these warnings:
scala> rdd.cache()
res28: rdd.type = MappedRDD[63] at repartition at <console>:28

scala> rdd.count()
14/07/19 12:45:18 WARN BlockManager: Block rdd_63_82 could not be dropped from memory as it does not exist
14/07/19 12:45:18 WARN BlockManager: Putting block rdd_63_82 failed
14/07/19 12:45:18 WARN BlockManager: Block rdd_63_40 could not be dropped from memory as it does not exist
14/07/19 12:45:18 WARN BlockManager: Putting block rdd_63_40 failed
res29: Long = 5

It seems that I cannot cache the data in memory, even though my local machine has 16 GB of RAM and the data is only 13 MB across 100 partitions. How can I prevent this caching issue from happening? Thanks.

Rindra

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Caching-issue-with-msg-RDD-block-could-not-be-dropped-from-memory-as-it-does-not-exist-tp10248.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
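P.S. For anyone trying to reproduce this, a sketch of the session leading up to the transcript above might look like the following. Only `rdd`, `repartition` to 100 partitions, `cache()`, and `count()` are taken from the post; the input path, the key used for `groupByKey`, and the variable names are assumptions for illustration.

```scala
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._

// In the spark-shell, sc is already provided; shown here for completeness.
val sc = new SparkContext("local[*]", "cache-test")

val raw = sc.textFile("data.txt")                    // ~13 MB input (path is an assumption)
val grouped = raw
  .map(line => (line.take(1), line))                 // hypothetical grouping key
  .groupByKey()
val rdd = grouped.repartition(100)                   // 100 partitions, as in the post

rdd.cache()                                          // default MEMORY_ONLY storage level
rdd.count()                                          // action that triggers caching; the
                                                     // BlockManager warnings appear here
```

With 100 partitions over ~13 MB, each cached block is only on the order of 130 KB, so the data itself should fit easily in 16 GB of RAM.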