Thanks, Shao. :-)

I am wondering whether Spark will rebalance the storage overhead at 
runtime, since there is still some available space on the other nodes.


Best,
Yifan LI

> On 06 May 2015, at 14:57, Saisai Shao <sai.sai.s...@gmail.com> wrote:
> 
> I think you can configure multiple disks through spark.local.dir; the default 
> is /tmp. In any case, if your intermediate data is larger than the available 
> disk space, you will still hit this issue.
> 
> spark.local.dir (default: /tmp): Directory to use for "scratch" space in 
> Spark, including map output files and RDDs that get stored on disk. This 
> should be on a fast, local disk in your system. It can also be a 
> comma-separated list of multiple directories on different disks. NOTE: In 
> Spark 1.0 and later this will be overridden by SPARK_LOCAL_DIRS (Standalone, 
> Mesos) or LOCAL_DIRS (YARN) environment variables set by the cluster manager.
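> 
> For example, a minimal sketch of spreading scratch space across two disks 
> (the mount points /mnt/disk1 and /mnt/disk2 and the app name are 
> hypothetical; substitute whatever disks your nodes actually have):
> 
>     // Hypothetical mount points; substitute disks that exist on your nodes.
>     val conf = new org.apache.spark.SparkConf()
>       .setAppName("GraphXJob")
>       // A comma-separated list spreads Spark's scratch space across both disks.
>       .set("spark.local.dir", "/mnt/disk1/spark,/mnt/disk2/spark")
>     val sc = new org.apache.spark.SparkContext(conf)
> 
> On Spark 1.0 and later, keep the note above in mind: SPARK_LOCAL_DIRS (or 
> LOCAL_DIRS on YARN) set by the cluster manager will override this setting.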
> 
> 2015-05-06 20:35 GMT+08:00 Yifan LI <iamyifa...@gmail.com>:
> Hi,
> 
> I am running my GraphX application on Spark, but it failed because of an 
> error on one executor node (on which the available HDFS space is small): "no 
> space left on device".
> 
> I can understand why this happened: my vertex(-attribute) RDD was growing 
> larger and larger during the computation, so at some point the space 
> requested on that node may have exceeded what was available.
> 
> But is there any way to avoid this kind of error? I am sure that the overall 
> disk space across all nodes is enough for my application.
> 
> Thanks in advance!
> 
> 
> 
> Best,
> Yifan LI
> 