SPARK_LOCAL_DIRS Issue

2015-02-11 Thread TJ Klein
Hi, Using Spark 1.2 I ran into issues setting SPARK_LOCAL_DIRS to a different path than the local directory. On our cluster we have a folder for temporary files (in a central file system), which is called /scratch. When setting SPARK_LOCAL_DIRS=/scratch/node name I get: An error occurred while
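A minimal sketch of what a per-node scratch directory might look like in conf/spark-env.sh, assuming the intent is a node-specific subdirectory of /scratch (the $(hostname) expansion and the mkdir step are illustrative, not from the thread):

```shell
# conf/spark-env.sh (sketch) -- give each worker its own scratch subdirectory.
# $(hostname) is evaluated on each node when the worker starts, so every
# worker writes to its own path under the shared /scratch mount.
export SPARK_LOCAL_DIRS="/scratch/$(hostname)"

# The directory should exist and be writable before the worker starts;
# otherwise Spark fails at startup when it tries to create temp files there.
mkdir -p "$SPARK_LOCAL_DIRS"
```

Since spark-env.sh is sourced on each node, this keeps the configuration identical cluster-wide while the effective path differs per host.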

Re: SPARK_LOCAL_DIRS Issue

2015-02-11 Thread Charles Feduke
A central location, such as NFS? If they are temporary for the purpose of further job processing you'll want to keep them local to the node in the cluster, i.e., in /tmp. If they are centralized you won't be able to take advantage of data locality and the central file store will become a

Re: SPARK_LOCAL_DIRS Issue

2015-02-11 Thread Tassilo Klein
Thanks for the info. The file system in use is a Lustre file system. Best, Tassilo On Wed, Feb 11, 2015 at 12:15 PM, Charles Feduke charles.fed...@gmail.com wrote: A central location, such as NFS? If they are temporary for the purpose of further job processing you'll want to keep them

Re: SPARK_LOCAL_DIRS Issue

2015-02-11 Thread Charles Feduke
Take a look at this: http://wiki.lustre.org/index.php/Running_Hadoop_with_Lustre Particularly: http://wiki.lustre.org/images/1/1b/Hadoop_wp_v0.4.2.pdf (linked from that article) to get a better idea of what your options are. If it's possible to avoid writing to [any] disk I'd recommend that

Re: set SPARK_LOCAL_DIRS issue

2014-08-12 Thread Andrew Ash
// assuming Spark 1.0 Hi Baoqiang, In my experience for the standalone cluster you need to set SPARK_WORKER_DIR not SPARK_LOCAL_DIRS to control where shuffle files are written. I think this is a documentation issue that could be improved, as
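The distinction Andrew draws can be sketched in conf/spark-env.sh (a hedged sketch for a Spark 1.x standalone cluster; the paths are illustrative):

```shell
# conf/spark-env.sh (sketch, Spark 1.x standalone cluster)

# SPARK_WORKER_DIR: where the standalone worker creates per-application work
# directories -- per this thread, the setting that actually moves shuffle
# output off /tmp on a standalone cluster.
export SPARK_WORKER_DIR="/mnt/data/spark-work"

# SPARK_LOCAL_DIRS: scratch space for map output files and RDD blocks
# spilled to disk.
export SPARK_LOCAL_DIRS="/mnt/data/spark-local"
```

Both directories must exist and be writable by the user running the worker daemon on every node.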

set SPARK_LOCAL_DIRS issue

2014-08-09 Thread Baoqiang Cao
Hi, I'm trying to use a specific dir for the Spark working directory since I have limited space at /tmp. I tried: 1) export SPARK_LOCAL_DIRS=“/mnt/data/tmp” or 2) SPARK_LOCAL_DIRS=“/mnt/data/tmp” in spark-env.sh. But neither worked; the output of Spark still says ERROR
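Two things worth noting, as a sketch rather than a confirmed diagnosis: the curly quotes in the message may just be a mail-client substitution, but if typed literally into spark-env.sh they become part of the value rather than shell quoting; and the same directory can also be set as the spark.local.dir property (though in Spark 1.x the SPARK_LOCAL_DIRS environment variable takes precedence when both are set). The application class and jar below are hypothetical:

```shell
# Sketch: set the scratch directory via the spark.local.dir property at
# submit time. Note the plain ASCII quoting -- curly quotes would be passed
# through literally as part of the path.
spark-submit \
  --conf spark.local.dir=/mnt/data/tmp \
  --class org.example.MyApp \
  myapp.jar
```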