How to enforce RDD to be cached?

2014-12-03 Thread shahab
Hi, I noticed that rdd.cache() does not take effect immediately; because of Spark's lazy evaluation, the caching only happens at the moment you perform some map/reduce action. Is this true? If so, how can I force Spark to cache the RDD right at the cache() statement? I need this to ...
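
A minimal sketch of the behaviour being described, assuming a local SparkContext and a made-up input file data.txt (illustration only, not code from the thread):

    import org.apache.spark.{SparkConf, SparkContext}

    val sc = new SparkContext(new SparkConf().setAppName("cache-laziness").setMaster("local[*]"))

    // cache() only records the desired storage level; it returns immediately
    // and nothing is read or computed at this point.
    val rdd = sc.textFile("data.txt").map(_.length).cache()

    // The first action triggers the actual computation and, as a side effect,
    // populates the cache.
    val totalChars = rdd.reduce(_ + _)

    // Later actions reuse the cached partitions instead of re-reading the file.
    val lines = rdd.count()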

Re: How to enforce RDD to be cached?

2014-12-03 Thread Daniel Darabos
On Wed, Dec 3, 2014 at 10:52 AM, shahab <shahab.mok...@gmail.com> wrote: Hi, I noticed that rdd.cache() does not take effect immediately; because of Spark's lazy evaluation, the caching only happens at the moment you perform some map/reduce action. Is this true? Yes, this is correct. If this is ...

Re: How to enforce RDD to be cached?

2014-12-03 Thread Paolo Platter
Yes, otherwise you can try rdd.cache().count() and then run your benchmark. Paolo. From: Daniel Darabos <daniel.dara...@lynxanalytics.com> Sent: Wednesday, 3 December 2014 12:28 To: shahab <shahab.mok...@gmail.com> Cc: user@spark.apache.org On ...
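
A sketch of that suggestion, assuming an existing SparkContext sc (e.g. in spark-shell) and a made-up input file and parsing step:

    // cache() marks the RDD; count() is the action that actually materializes it.
    val parsed = sc.textFile("events.log").map(_.split(",").length).cache()
    parsed.count()                        // forces evaluation and fills the cache

    // Only start timing afterwards, so the benchmark measures work against
    // cached data rather than the initial read + parse.
    val t0 = System.nanoTime()
    val wide = parsed.filter(_ > 10).count()
    println(s"filter + count on cached RDD: ${(System.nanoTime() - t0) / 1e6} ms, rows = $wide")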

Re: How to enforce RDD to be cached?

2014-12-03 Thread shahab
Daniel and Paolo, thanks for the comments. Best, /Shahab. On Wed, Dec 3, 2014 at 3:12 PM, Paolo Platter <paolo.plat...@agilelab.it> wrote: Yes, otherwise you can try rdd.cache().count() and then run your benchmark. Paolo. From: Daniel Darabos <daniel.dara...@lynxanalytics.com> ...

Re: How to enforce RDD to be cached?

2014-12-03 Thread dsiegel
... of the RDD, rather than iterating.
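
The preview above is truncated; one common variant of the same advice (not necessarily dsiegel's exact point) is that any action touching every partition of the RDD populates the cache, so a no-op foreachPartition works when you don't want to pull data back to the driver. A sketch, assuming an existing SparkContext sc:

    val cached = sc.parallelize(1 to 1000000).map(_ * 2).cache()

    // No-op action: visits every partition once, forcing computation and caching,
    // without collecting or iterating over the elements on the driver.
    cached.foreachPartition(_ => ())

    // Subsequent actions now read from the cache.
    println(cached.count())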