Re: Can we redirect Spark shuffle spill data to HDFS or Alluxio?

2016-08-24 Thread Sun Rui
Before 2.0, Spark had built-in support for caching RDD data on 
Tachyon (Alluxio), but that support was removed in 2.0. In either case, Spark 
does not support writing shuffle data to Tachyon.

Since Alluxio has experimental support for FUSE 
(http://www.alluxio.org/docs/master/en/Mounting-Alluxio-FS-with-FUSE.html), 
you can try it and set spark.local.dir to point to the Alluxio FUSE mount 
directory.
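If you want to experiment with that, a rough sketch might look like the
following. The mount point, alluxio-fuse script path, and the spark-tmp
subdirectory are illustrative assumptions; check the Alluxio FUSE docs linked
above for the exact commands for your version:

```shell
# Mount the Alluxio namespace through FUSE (sketch, not a tested recipe;
# script location varies by Alluxio release)
sudo mkdir -p /mnt/alluxio-fuse
${ALLUXIO_HOME}/integration/fuse/bin/alluxio-fuse mount /mnt/alluxio-fuse /

# Point Spark's shuffle/spill directory at the FUSE mount
# (in ${SPARK_HOME}/conf/spark-defaults.conf)
# spark.local.dir  /mnt/alluxio-fuse/spark-tmp
```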

There is also an ongoing effort to take advantage of SSDs to improve 
shuffle performance; see https://issues.apache.org/jira/browse/SPARK-12196. 
The PR is ready but has not been merged yet. You may give it a try yourself.



Re: Re: Can we redirect Spark shuffle spill data to HDFS or Alluxio?

2016-08-24 Thread Saisai Shao
Based on my limited knowledge of Tachyon (Alluxio), it only provides a
layer of Hadoop-compatible FileSystem API, which means it cannot be used as a
shuffle data store. If it can be mounted as an OS-supported FS layer, like
NFS or FUSE, then it can be used for shuffle data storage.

But never neglect the overhead and stability issues of a distributed FS (RPC
communication, network latency), since shuffle is quite performance-critical.



Re: Re: Can we redirect Spark shuffle spill data to HDFS or Alluxio?

2016-08-24 Thread tony....@tendcloud.com
Hi, Saisai and Rui,
Thanks a lot for your answers. Alluxio is designed to work as a middle layer 
between storage and Spark, so is it possible to use Alluxio to resolve the 
issue? We want to have one SSD on every datanode and use Alluxio to manage 
memory, SSD, and HDD.

Thanks and Regards,
Tony



tony@tendcloud.com
 


Re: Can we redirect Spark shuffle spill data to HDFS or Alluxio?

2016-08-24 Thread Sun Rui
Yes, I also tried FUSE before; it is not stable and I don't recommend it.



Re: Can we redirect Spark shuffle spill data to HDFS or Alluxio?

2016-08-24 Thread Saisai Shao
Also, FUSE is another candidate (https://wiki.apache.org/hadoop/MountableHDFS),
but it was not so stable when I tried it before.
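For reference, mounting HDFS through the fuse_dfs contrib module typically
looks roughly like the sketch below. The namenode address and mount point are
placeholders, and the exact invocation depends on how your Hadoop native code
was built; see the MountableHDFS wiki page above:

```shell
# Mount HDFS at /mnt/hdfs via fuse_dfs (requires the fuse_dfs binary from
# the Hadoop native build and FUSE installed on the host; sketch only)
sudo mkdir -p /mnt/hdfs
sudo fuse_dfs_wrapper.sh dfs://namenode-host:8020 /mnt/hdfs

# Shuffle could then in principle be redirected there:
# spark.local.dir  /mnt/hdfs/spark-tmp   (in spark-defaults.conf)
```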



Re: Can we redirect Spark shuffle spill data to HDFS or Alluxio?

2016-08-24 Thread Sun Rui
For HDFS, maybe you can try mounting HDFS as NFS. But I'm not sure about the 
stability, and there is also the additional overhead of network I/O and of 
HDFS file replication.
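In case it helps, the HDFS NFS gateway route looks roughly like this (Hadoop
2.x; the gateway host name is a placeholder, and the daemon start commands
differ between Hadoop versions, so treat this as a sketch and consult the
HDFS NFS Gateway documentation):

```shell
# Start the NFS gateway daemons on the gateway node
# (command form varies by Hadoop version)
hdfs portmap &
hdfs nfs3 &

# Mount the gateway's NFS export on each Spark worker node
sudo mkdir -p /mnt/hdfs-nfs
sudo mount -t nfs -o vers=3,proto=tcp,nolock nfs-gateway-host:/ /mnt/hdfs-nfs

# Then, in principle: spark.local.dir /mnt/hdfs-nfs/spark-tmp
```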


Re: Can we redirect Spark shuffle spill data to HDFS or Alluxio?

2016-08-24 Thread Saisai Shao
Spark shuffle uses the Java File API to create local directories and to
read/write data, so it works only with OS-supported filesystems. It does not
go through the Hadoop FileSystem API, so writing shuffle data to a
Hadoop-compatible FS does not work.

Also, it is not a good idea to write temporary shuffle data to a distributed
FS; that brings unnecessary overhead. In your case, if you have a lot of
memory on each node, you could use ramfs instead to store shuffle data.
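A RAM-backed setup for shuffle might look like this (tmpfs shown here as the
common variant; the size and path are illustrative, and the tmpfs size must
leave enough RAM for the executors themselves):

```shell
# Back shuffle data with RAM on each worker node (sketch; size the tmpfs
# so that executor heaps still fit in physical memory)
sudo mkdir -p /mnt/spark-ramdisk
sudo mount -t tmpfs -o size=32g tmpfs /mnt/spark-ramdisk

# Point Spark at it (in conf/spark-defaults.conf)
# spark.local.dir  /mnt/spark-ramdisk
```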

Thanks
Saisai

On Wed, Aug 24, 2016 at 8:11 PM, tony@tendcloud.com <
tony@tendcloud.com> wrote:

> Hi, All,
> When we run Spark on very large data, Spark shuffles and the shuffle data
> is written to local disk. Because we have limited local disk capacity, the
> shuffle data occupies all of the local disk and the job then fails. So is
> there a way to write the shuffle spill data to HDFS? Or, if we introduce
> Alluxio in our system, can the shuffle data be written to Alluxio?
>
> Thanks and Regards,
>
> --
> Zhitao Yan (Tony)
>
> Beijing TendCloud Technology Co., Ltd. (北京腾云天下科技有限公司)
>
> Email: tony@tendcloud.com
> Phone: 13911815695
> WeChat: zhitao_yan
> QQ: 4707059
> Address: Room 602, Aviation Service Building, Building 2, No. 39
> Dongzhimenwai Street, Dongcheng District, Beijing
> Postal code: 100027
>
> TalkingData.com - 让数据说话 (Let the data speak)
>