Hi,
I noticed that rdd.cache() does not take effect immediately; because of
Spark's lazy evaluation, the caching only happens at the moment you perform
some map/reduce action. Is this true?
If this is the case, how can I force Spark to cache the RDD immediately at
the cache() statement? I need this for a benchmark.
On Wed, Dec 3, 2014 at 10:52 AM, shahab <shahab.mok...@gmail.com> wrote:
> Hi,
> I noticed that rdd.cache() does not take effect immediately; because of
> Spark's lazy evaluation, the caching only happens at the moment you
> perform some map/reduce action. Is this true?
Yes, this is correct.
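A quick way to see this kind of laziness outside of Spark: a plain Python generator behaves the same way (this is only an analogy, not Spark code). Defining the pipeline does no work; the work runs only when you consume it, just as an RDD transformation chain runs only when an action is called.

```python
log = []

def doubled(data):
    for x in data:
        log.append(x)      # records when the work actually happens
        yield x * 2

pipeline = doubled([1, 2, 3])   # lazy: nothing has run yet
assert log == []                # no work done at definition time
result = list(pipeline)         # consuming it triggers the work (the "action")
assert result == [2, 4, 6]
assert log == [1, 2, 3]
```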
Yes,
otherwise you can try:
rdd.cache().count()
and then run your benchmark.
Paolo
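Paolo's rdd.cache().count() trick works because count() is an action that touches every partition, so the cached data is materialized before the benchmark begins. The same warm-up idea can be sketched in plain Python (an analogy using functools, not Spark): call the cached computation once up front, then time only the cached access.

```python
import functools
import time

@functools.lru_cache(maxsize=None)   # marks the result for caching, like rdd.cache()
def expensive():
    return sum(i * i for i in range(100_000))

# Analogous to cache().count(): invoke it once up front so the cached value
# is materialized before the benchmark starts.
expensive()

start = time.perf_counter()
value = expensive()                  # now served from the cache
elapsed = time.perf_counter() - start
assert expensive.cache_info().hits >= 1
```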
From: Daniel Darabos <daniel.dara...@lynxanalytics.com>
Sent: Wednesday, 3 December 2014, 12:28
To: shahab <shahab.mok...@gmail.com>
Cc: user@spark.apache.org
Daniel and Paolo, thanks for the comments.
best,
/Shahab
On Wed, Dec 3, 2014 at 3:12 PM, Paolo Platter <paolo.plat...@agilelab.it>
wrote:
> Yes,
> otherwise you can try:
> rdd.cache().count()
> and then run your benchmark.
> Paolo
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/How-to-enforce-RDD-to-be-cached-tp20230p20284.html
Sent from the Apache Spark User List mailing list archive at Nabble.com