Hi all,

When we run Spark on very large data, Spark performs a shuffle and the shuffle data is written to local disk. Because we have limited local disk capacity, the shuffle data eventually fills up the disk and the job fails. Is there a way to write the shuffle spill data to HDFS instead? Or, if we introduce Alluxio into our system, can the shuffle data be written to Alluxio?
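For context, my understanding is that spark.local.dir is the knob that controls where shuffle and spill files land (it accepts a comma-separated list of directories, and Spark spreads shuffle output across them). A minimal sketch of pointing it at larger volumes follows; the mount paths are placeholders, not real paths on our cluster, and this in-code form only takes effect in local mode since in cluster mode the setting must be supplied at launch (e.g. via spark-submit --conf), with YARN typically overriding it from yarn.nodemanager.local-dirs:

    import org.apache.spark.sql.SparkSession

    // Minimal sketch (local mode): spark.local.dir decides where shuffle and
    // spill files are written. The paths below are hypothetical placeholders;
    // replace them with volumes that actually exist on the executors.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("shuffle-spill-dirs")
      // Comma-separated list: Spark round-robins shuffle files across these dirs.
      .config("spark.local.dir", "/mnt/vol1/spark-tmp,/mnt/vol2/spark-tmp")
      .getOrCreate()

As far as I know, Spark's shuffle writers need a path with local-filesystem semantics, so plain HDFS cannot be used directly here; a FUSE mount that exposes remote storage as a local path (e.g. alluxio-fuse) is the kind of workaround I have seen mentioned, though I cannot confirm it from experience.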
Thanks and Regards,

阎志涛 (Tony)
北京腾云天下科技有限公司
--------------------------------------------------------------------------------------------------------
Email: tony....@tendcloud.com
Phone: 13911815695
WeChat: zhitao_yan
QQ: 4707059
Address: Room 602, Aviation Service Building, Building 2, Yard 39, Dongzhimenwai Street, Dongcheng District, Beijing
Postal code: 100027
--------------------------------------------------------------------------------------------------------
TalkingData.com - Let the data speak