Hi all,

When we run Spark on very large data, Spark performs a shuffle and the shuffle data is written to local disk. Because we have limited local disk capacity, the shuffle data eventually fills up the disk and the job fails. Is there a way to write the shuffle spill data to HDFS instead? Or, if we introduce Alluxio into our system, can the shuffle data be written to Alluxio?
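For context, my understanding is that spark.local.dir is the knob that controls where shuffle and spill files land (it accepts a comma-separated list of directories, and Spark spreads shuffle output across them). A minimal sketch of pointing it at larger volumes follows; the mount paths are placeholders, not real paths on our cluster, and this in-code form only takes effect in local mode since in cluster mode the setting must be supplied at launch (e.g. via spark-submit --conf), with YARN typically overriding it from yarn.nodemanager.local-dirs:

    import org.apache.spark.sql.SparkSession

    // Minimal sketch (local mode): spark.local.dir decides where shuffle and
    // spill files are written. The paths below are hypothetical placeholders;
    // replace them with volumes that actually exist on the executors.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("shuffle-spill-dirs")
      // Comma-separated list: Spark round-robins shuffle files across these dirs.
      .config("spark.local.dir", "/mnt/vol1/spark-tmp,/mnt/vol2/spark-tmp")
      .getOrCreate()

As far as I know, Spark's shuffle writers need a path with local-filesystem semantics, so plain HDFS cannot be used directly here; a FUSE mount that exposes remote storage as a local path (e.g. alluxio-fuse) is the kind of workaround I have seen mentioned, though I cannot confirm it from experience.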
Thanks and Regards,

阎志涛 (Tony)
北京腾云天下科技有限公司
--------------------------------------------------------------------------------------------------------
Email: tony....@tendcloud.com
Phone: 13911815695
WeChat: zhitao_yan
QQ: 4707059
Address: Room 602, Aviation Service Building, Building 2, Yard 39, Dongzhimenwai Street, Dongcheng District, Beijing
Postal code: 100027
--------------------------------------------------------------------------------------------------------
TalkingData.com - Let the data speak