[jira] [Commented] (SPARK-22229) SPIP: RDMA Accelerated Shuffle Engine

Saisai Shao (JIRA) Thu, 12 Oct 2017 18:14:57 -0700

    [ https://issues.apache.org/jira/browse/SPARK-22229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202900#comment-16202900 ]


Saisai Shao commented on SPARK-22229:
-------------------------------------

{quote}
I don't think that limited familiarity with a new promising feature is a good 
enough reason to avoid it. If every new feature will be treated this way, then 
new technologies will never get introduced to Spark.
{quote}

[~yuvaldeg] I think you may have misunderstood my point. I'm not saying that 
Spark should never introduce new technologies. My point is that if a technology 
is not only promising but also has a large audience, then of course we should 
bring it in, as with k8s support. AFAIK, RDMA adoption is not that common in 
the big data area.

Just my two cents.

> SPIP: RDMA Accelerated Shuffle Engine
> -------------------------------------
>
>                 Key: SPARK-22229
>                 URL: https://issues.apache.org/jira/browse/SPARK-22229
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 2.3.0
>            Reporter: Yuval Degani
>         Attachments: 
> SPARK-22229_SPIP_RDMA_Accelerated_Shuffle_Engine_Rev_1.0.pdf
>
>
> An RDMA-accelerated shuffle engine can provide enormous performance benefits 
> to shuffle-intensive Spark jobs, as demonstrated in the “SparkRDMA” plugin 
> open-source project ([https://github.com/Mellanox/SparkRDMA]).
> Using RDMA for shuffle improves CPU utilization significantly and reduces I/O 
> processing overhead by bypassing the kernel and networking stack as well as 
> avoiding memory copies entirely. Those valuable CPU cycles are then consumed 
> directly by the actual Spark workloads, and help reduce the job runtime 
> significantly.
> This performance gain is demonstrated with both the industry-standard HiBench 
> TeraSort (a 1.5x speedup in sorting) and shuffle-intensive customer 
> applications.
> SparkRDMA will be presented at Spark Summit 2017 in Dublin 
> ([https://spark-summit.org/eu-2017/events/accelerating-shuffle-a-tailor-made-rdma-solution-for-apache-spark/]).
> Please see the attached proposal document for more information.



