You can use *SparkContext.checkpointFile(<path to the dir containing RDD
checkpoint>)*. However, note that the checkpoint file contains
Java-serialized data. If your data types change between writing and
reading the checkpoint file for whatever reason (a Spark version change,
recompiled code, etc.), you may not be able to read the checkpoint back.
So use it carefully :)
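
A rough sketch of that call, under some assumptions: in the Spark sources I
am aware of, checkpointFile is declared protected[spark], so ordinary user
code cannot call it directly. One workaround is a small helper compiled into
the org.apache.spark package. The names CheckpointReader, read and
rddCheckpointPath are hypothetical, chosen only for illustration, and this
has not been checked against every Spark release.

```scala
// Hypothetical helper; it lives in org.apache.spark only so that it can reach
// the protected[spark] method SparkContext.checkpointFile.
package org.apache.spark

import scala.reflect.ClassTag
import org.apache.spark.rdd.RDD

object CheckpointReader {
  // rddCheckpointPath: the directory of one checkpointed RDD, i.e. the value
  // that RDD.getCheckpointFile returned in the application that wrote it.
  def read[T: ClassTag](sc: SparkContext, rddCheckpointPath: String): RDD[T] =
    sc.checkpointFile[T](rddCheckpointPath)
}
```

In the second application this would be called as
CheckpointReader.read[Int](sc, path), where the element type has to match
what the first application wrote.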

On Thu, May 18, 2017 at 12:18 AM, Neelesh Sambhajiche <
sambhajicheneel...@gmail.com> wrote:

> That is exactly what we are currently doing - storing it in a CSV file.
> However, as checkpointing permanently writes to disk, if we use
> checkpointing along with saving the RDD to a text file, the data gets
> stored twice on the disk. That is why I was looking for a way to read the
> checkpointed data in a different program.
>
> On Wed, May 17, 2017 at 12:59 PM, Tathagata Das <
> tathagata.das1...@gmail.com> wrote:
>
>> Why not just save the RDD to a proper file? Text file, sequence file,
>> many options. Then it's standard to read it back in a different program.
>>
>> On Wed, May 17, 2017 at 12:01 AM, neelesh.sa <
>> sambhajicheneel...@gmail.com> wrote:
>>
>>> Is it possible to checkpoint an RDD in one run of my application and use
>>> the saved RDD in the next run of my application?
>>>
>>> For example, with the following code:
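>>> // assumes a checkpoint directory was already set via sc.setCheckpointDir
>>> val x = List(1, 2, 3, 4)
>>> val y = sc.parallelize(x, 2).map(c => c * 2)
>>> y.checkpoint()  // marks the RDD for checkpointing
>>> y.count()       // the action triggers the actual checkpoint write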
>>>
>>> Is it possible to read the checkpointed RDD in another application?
>
> --
> Regards,
> Neelesh Sambhajiche
> Mobile: 8058437181
> Birla Institute of Technology & Science, Pilani
> Pilani Campus, Rajasthan 333 031, INDIA
>