spark.local.dir and spark.worker.dir not used

2014-09-23 Thread Priya Ch
Hi,

I am using Spark 1.0.0. In my Spark code I am trying to persist an RDD to
disk with rdd.persist(DISK_ONLY), but unfortunately I couldn't find the
location where the RDD has been written to disk. I set SPARK_LOCAL_DIRS
and SPARK_WORKER_DIR to a location other than the default /tmp directory,
but I still couldn't see anything in the worker directory or the Spark
local directory.

I also tried specifying the local dir and worker dir from the Spark code
while defining the SparkConf, as conf.set("spark.local.dir",
"/home/padma/sparkdir"), but the directories are not used.


In general, which directories does Spark use for map output files,
intermediate writes, and persisting RDDs to disk?


Thanks,
Padma Ch
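[Editor's note: a minimal PySpark sketch of the setup described above, not from the original thread. It needs a Spark installation to run; the application name and RDD contents are illustrative. One detail worth checking: spark.local.dir must be set before the SparkContext is created, and a worker's SPARK_LOCAL_DIRS environment variable takes precedence over it, which may be why the setting appears unused.]

```python
# Sketch only (requires a Spark 1.x installation; names are illustrative).
from pyspark import SparkConf, SparkContext, StorageLevel

conf = (SparkConf()
        .setAppName("persist-demo")                       # illustrative name
        .set("spark.local.dir", "/home/padma/sparkdir"))  # path from the question
sc = SparkContext(conf=conf)                              # conf must be set first

rdd = sc.parallelize(range(100))
rdd.persist(StorageLevel.DISK_ONLY)
rdd.count()  # an action forces the partitions to be materialized to disk
```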


RE: spark.local.dir and spark.worker.dir not used

2014-09-23 Thread Shao, Saisai
Hi,

spark.local.dir is the directory used to write map output data and persisted
RDD blocks, but the file paths are hashed, so you cannot directly find the
persisted RDD block files. They will definitely be in these folders on your
worker node, though.

Thanks
Jerry
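[Editor's note: an illustration of the layout Jerry describes, with hypothetical names. Under each directory listed in spark.local.dir, the application creates a "spark-local-<timestamp>-<suffix>" folder and hashes block files into subdirectories; persisted RDD partitions are stored as files named "rdd_<rddId>_<partitionId>". The snippet simulates that layout in a temp directory to show how one could search for persisted blocks.]

```python
# Simulated spark.local.dir layout (names are illustrative, not a real run).
import os, tempfile, glob

root = tempfile.mkdtemp()
block_dir = os.path.join(root, "spark-local-20140923123456-abcd", "0c")
os.makedirs(block_dir)
open(os.path.join(block_dir, "rdd_2_0"), "w").close()  # RDD 2, partition 0

# On a real worker, point the glob at your spark.local.dir instead of `root`:
blocks = glob.glob(os.path.join(root, "spark-local-*", "*", "rdd_*"))
print(blocks)
```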



Re: spark.local.dir and spark.worker.dir not used

2014-09-23 Thread Chitturi Padma
Is it possible to view the persisted RDD blocks?

If I use YARN, would the RDD blocks be persisted to HDFS, and would I then
be able to read the HDFS blocks as I can in Hadoop?






--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/spark-local-dir-and-spark-worker-dir-not-used-tp14881p14886.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

Re: spark.local.dir and spark.worker.dir not used

2014-09-23 Thread Chitturi Padma
I couldn't even see the spark-<id> folder in the default /tmp directory of
local.dir.







RE: spark.local.dir and spark.worker.dir not used

2014-09-23 Thread Shao, Saisai
This folder is created under your spark.local.dir when you start your Spark
application, with the name "spark-local-xxx" as a prefix. It's quite strange
that you don't see this folder; maybe you missed something. Besides, if Spark
cannot create this folder at startup, persisting an RDD to disk will fail.

Also, I think there's no way to persist an RDD to HDFS, even on YARN; only an
RDD's checkpoint can save data on HDFS.

Thanks
Jerry
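[Editor's note: a sketch of the checkpoint alternative Jerry mentions, not from the original thread. It assumes a running Spark-on-YARN setup, and the HDFS path is hypothetical. Unlike persist(DISK_ONLY), which writes only to local disk, checkpointing saves RDD data to the checkpoint directory, which can live on HDFS.]

```python
# Sketch only (requires a Spark installation with HDFS access).
from pyspark import SparkConf, SparkContext

sc = SparkContext(conf=SparkConf().setAppName("checkpoint-demo"))
sc.setCheckpointDir("hdfs:///user/padma/checkpoints")  # hypothetical path

rdd = sc.parallelize(range(100))
rdd.checkpoint()  # marked now; files are written when the next action runs
rdd.count()       # triggers computation and writes checkpoint files to HDFS
```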




