Re: where storagelevel DISK_ONLY persists RDD to

2015-01-25 Thread Larry Liu
Hi, Charles

Thanks for your reply.

Is it possible to persist an RDD to HDFS? What is the default location where
an RDD is persisted with storage level DISK_ONLY?

On Sun, Jan 25, 2015 at 6:26 AM, Charles Feduke charles.fed...@gmail.com
wrote:

 I think you want to use `.saveAsSequenceFile` instead to save an RDD to
 someplace like HDFS or NFS if you are attempting to interoperate with
 another system, such as Hadoop. `.persist` is for keeping the contents of
 an RDD around so future uses of that particular RDD don't need to
 recalculate its composite parts.


 On Sun Jan 25 2015 at 3:36:31 AM Larry Liu larryli...@gmail.com wrote:

 I would like to persist an RDD to HDFS or an NFS mount. How do I change the
 location?




RE: where storagelevel DISK_ONLY persists RDD to

2015-01-25 Thread Shao, Saisai
No, the current RDD persistence mechanism does not support putting data on HDFS.

The directory is the one set by spark.local.dir.

Instead you can use checkpoint() to save the RDD on HDFS.
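
To make the two points above concrete, here is a minimal sketch, not a definitive recipe. It assumes a Spark 1.x style `SparkContext`; the scratch path, the HDFS URL, and the object name are hypothetical placeholders. Note that the property is spelled `spark.local.dir` in the Spark configuration documentation, and that a cluster manager may override it (for example through SPARK_LOCAL_DIRS on standalone or Mesos, or LOCAL_DIRS on YARN).

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.storage.StorageLevel

object CheckpointSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("checkpoint-sketch")
      .setMaster("local[2]")
      // Hypothetical scratch path: DISK_ONLY blocks are written under this
      // directory on each node, unless the cluster manager overrides it.
      .set("spark.local.dir", "/mnt/fast-disk/spark-scratch")
    val sc = new SparkContext(conf)

    // Hypothetical HDFS URL: checkpoint files are written under this directory.
    sc.setCheckpointDir("hdfs://namenode:8020/tmp/spark-checkpoints")

    val squares = sc.parallelize(1 to 1000).map(i => i.toLong * i)

    // Persisting first avoids recomputing the RDD when the checkpoint is written.
    squares.persist(StorageLevel.DISK_ONLY)

    // checkpoint() only marks the RDD; it must be called before the first action.
    squares.checkpoint()

    // The first action runs the job and then writes the checkpoint files.
    println(s"count = ${squares.count()}")
    println(s"checkpointed = ${squares.isCheckpointed}")

    sc.stop()
  }
}
```

The persisted blocks under `spark.local.dir` are scratch data tied to the running application, while the checkpoint files live in the checkpoint directory on HDFS.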

Thanks
Jerry

From: Larry Liu [mailto:larryli...@gmail.com]
Sent: Monday, January 26, 2015 3:08 PM
To: Charles Feduke
Cc: u...@spark.incubator.apache.org
Subject: Re: where storagelevel DISK_ONLY persists RDD to

Hi, Charles

Thanks for your reply.

Is it possible to persist an RDD to HDFS? What is the default location where
an RDD is persisted with storage level DISK_ONLY?

On Sun, Jan 25, 2015 at 6:26 AM, Charles Feduke charles.fed...@gmail.com
wrote:
I think you want to use `.saveAsSequenceFile` instead to save an RDD to
someplace like HDFS or NFS if you are attempting to interoperate with another
system, such as Hadoop. `.persist` is for keeping the contents of an RDD around
so future uses of that particular RDD don't need to recalculate its composite
parts.

On Sun Jan 25 2015 at 3:36:31 AM Larry Liu larryli...@gmail.com wrote:
I would like to persist an RDD to HDFS or an NFS mount. How do I change the
location?



Re: where storagelevel DISK_ONLY persists RDD to

2015-01-25 Thread Charles Feduke
I think you want to use `.saveAsSequenceFile` instead to save an RDD to
someplace like HDFS or NFS if you are attempting to interoperate with
another system, such as Hadoop. `.persist` is for keeping the contents of
an RDD around so future uses of that particular RDD don't need to
recalculate its composite parts.
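
A rough sketch of the difference, assuming a Spark 1.x style `SparkContext` and a key/value RDD whose types convert to Hadoop Writables. The object name, the sample data, and the default output path (including the `namenode` host) are hypothetical placeholders; pass a real path as the first argument.

```scala
import org.apache.spark.{SparkConf, SparkContext}
// Brings in the pair-RDD implicits on Spark 1.2 and earlier; harmless on later versions.
import org.apache.spark.SparkContext._
import org.apache.spark.storage.StorageLevel

object PersistVsSave {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("persist-vs-save").setMaster("local[2]")
    val sc = new SparkContext(conf)

    // Hypothetical output location; it must not already exist.
    val outputPath = args.headOption.getOrElse("hdfs://namenode:8020/tmp/pairs-seq")

    val pairs = sc.parallelize(1 to 1000).map(i => (i, s"value-$i"))

    // persist: keeps the computed partitions around (here on local disk) so that
    // later actions in this application reuse them instead of recomputing.
    pairs.persist(StorageLevel.DISK_ONLY)
    println(pairs.count())                        // first action computes and stores
    println(pairs.filter(_._1 % 2 == 0).count())  // second action reads the stored blocks

    // saveAsSequenceFile: writes the data as a Hadoop SequenceFile that other
    // systems can read after this application has finished.
    pairs.saveAsSequenceFile(outputPath)

    sc.stop()
  }
}
```

The persisted copy is only useful to the running application, whereas the SequenceFile output is ordinary durable data at the path given.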

On Sun Jan 25 2015 at 3:36:31 AM Larry Liu larryli...@gmail.com wrote:

 I would like to persist an RDD to HDFS or an NFS mount. How do I change the
 location?