Hi,

I am working with a small dataset (about 13 MB) in the spark-shell. After
doing a groupBy on the RDD, I wanted to cache the RDD in memory, but I keep
getting these warnings:

scala> rdd.cache()
res28: rdd.type = MappedRDD[63] at repartition at <console>:28


scala> rdd.count()
14/07/19 12:45:18 WARN BlockManager: Block rdd_63_82 could not be dropped
from memory as it does not exist
14/07/19 12:45:18 WARN BlockManager: Putting block rdd_63_82 failed
14/07/19 12:45:18 WARN BlockManager: Block rdd_63_40 could not be dropped
from memory as it does not exist
14/07/19 12:45:18 WARN BlockManager: Putting block rdd_63_40 failed
res29: Long = 5

It seems that I cannot cache the data in memory even though my local
machine has 16 GB of RAM and the data is only 13 MB split across 100
partitions.
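
For reference, the sequence I run is roughly the following (the file path
and keying logic are placeholders, but the shape matches the output above):

scala> // load the ~13 MB dataset and key each record (placeholder path/logic)
scala> val raw = sc.textFile("/path/to/data.csv")
scala> val pairs = raw.map(line => (line.split(",")(0), line))
scala> // group by key, then repartition to the 100 partitions mentioned above
scala> val rdd = pairs.groupByKey().repartition(100)
scala> rdd.cache()
scala> rdd.count()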

How can I prevent this caching issue from happening?
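
In the meantime, one fallback I am considering, assuming the blocks are
simply being evicted for lack of cache space, is persisting with a disk
spill instead of the default MEMORY_ONLY level, and giving the shell more
memory (e.g. bin/spark-shell --driver-memory 4g):

scala> import org.apache.spark.storage.StorageLevel
scala> rdd.unpersist()  // storage level cannot be changed while the RDD is persisted
scala> rdd.persist(StorageLevel.MEMORY_AND_DISK)  // spill to disk when memory runs short
scala> rdd.count()

Thanks.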

Rindra


