Hi,
Using Spark 1.2, I ran into issues setting SPARK_LOCAL_DIRS to a path other
than the local directory.
On our cluster we have a folder for temporary files (in a central file
system), which is called /scratch.
When setting SPARK_LOCAL_DIRS=/scratch/<node name>
I get:
An error occurred while
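For reference, a sketch of the spark-env.sh line in question (assuming "node
name" is meant to be a per-host subdirectory, e.g. derived from $(hostname);
spark-env.sh is sourced as a shell script, so command substitution works):

# spark-env.sh -- sketch; the per-host subdirectory via $(hostname) is an assumption
export SPARK_LOCAL_DIRS="/scratch/$(hostname)"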
A central location, such as NFS?
If they are temporary for the purpose of further job processing, you'll want
to keep them local to the node in the cluster, e.g., in /tmp. If they are
centralized you won't be able to take advantage of data locality, and the
central file store will become a bottleneck.
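For example, a node-local setup might look like this (a sketch; the paths
are placeholders, and SPARK_LOCAL_DIRS accepts a comma-separated list to
spread scratch space across several local disks):

# spark-env.sh on each node -- sketch with placeholder paths
export SPARK_LOCAL_DIRS="/tmp/spark"
# or, spread across multiple local disks:
export SPARK_LOCAL_DIRS="/data1/spark,/data2/spark"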
Thanks for the info. The file system in use is a Lustre file system.
Best,
Tassilo
Take a look at this:
http://wiki.lustre.org/index.php/Running_Hadoop_with_Lustre
Particularly: http://wiki.lustre.org/images/1/1b/Hadoop_wp_v0.4.2.pdf
(linked from that article)
to get a better idea of what your options are.
If it's possible to avoid writing to [any] disk, I'd recommend that.
Hi Baoqiang,
In my experience, for a standalone cluster you need to set
SPARK_WORKER_DIR, not SPARK_LOCAL_DIRS, to control where shuffle files are
written. I think this is a documentation issue that could be improved.
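A minimal spark-env.sh sketch for that (the path is a placeholder; set it on
every worker before starting the cluster):

# spark-env.sh on each worker -- sketch; /mnt/data/tmp is a placeholder path
export SPARK_WORKER_DIR="/mnt/data/tmp"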
Hi
I'm trying to use a specific dir for the Spark working directory, since I
have limited space at /tmp. I tried:
1)
export SPARK_LOCAL_DIRS="/mnt/data/tmp"
or 2)
SPARK_LOCAL_DIRS="/mnt/data/tmp" in spark-env.sh
But neither worked; the output of Spark still says:
ERROR
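For completeness, a sketch combining the SPARK_WORKER_DIR advice above with
the original setting, for a standalone cluster (the path is the one from the
attempts above; workers need a restart to pick up the new environment):

# spark-env.sh on every node -- sketch combining both settings
export SPARK_WORKER_DIR="/mnt/data/tmp"   # standalone worker work dir (app files, logs)
export SPARK_LOCAL_DIRS="/mnt/data/tmp"   # scratch space for shuffle/spill files
# restart the cluster so workers pick up the new environment:
sbin/stop-all.sh
sbin/start-all.sh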