[jira] [Commented] (SPARK-12196) Store blocks in storage devices with hierarchy way

Apache Spark (JIRA) Mon, 07 Dec 2015 23:59:02 -0800

    [ 
https://issues.apache.org/jira/browse/SPARK-12196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15046544#comment-15046544
 ]


Apache Spark commented on SPARK-12196:
--------------------------------------

User 'yucai' has created a pull request for this issue:
https://github.com/apache/spark/pull/10192

> Store blocks in storage devices with hierarchy way
> --------------------------------------------------
>
>                 Key: SPARK-12196
>                 URL: https://issues.apache.org/jira/browse/SPARK-12196
>             Project: Spark
>          Issue Type: New Feature
>          Components: Spark Core
>            Reporter: yucai
>
> Problem:
>     Many users now have both SSDs and HDDs.
>     SSDs offer high performance but low capacity. HDDs offer high
> capacity, but their performance is 2x-3x lower than that of SSDs.
>     How can we get the best of both?
> Solution:
>     Our idea is to build a hierarchical store: use SSDs as a cache and HDDs as
> backing storage.
>     When Spark core allocates blocks for an RDD (either shuffle or RDD cache),
> it takes them from SSDs first; once the SSDs' usable space falls below
> some threshold, it takes them from HDDs instead.
>     Our implementation actually goes further: it supports building a
> hierarchical store with any number of levels across all storage media
> (NVM, SSD, HDD, etc.).
> Performance:
>     1. In the best case, our solution performs the same as an all-SSD setup.
>         In the worst case, e.g. when all data is spilled to HDDs, there is no
> performance regression.
>     2. Compared with an all-HDD setup, the hierarchical store improves
> performance by more than 1.86x (it could be higher; the CPU was the
> bottleneck in our test environment).
>     3. Compared with Tachyon, our hierarchical store is still 1.3x faster,
> because it supports both RDD cache and shuffle and needs no extra
> inter-process communication.
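
The allocation rule described under "Solution" (take the fastest tier first, fall back once its usable space drops below a threshold) might be sketched roughly as follows. This is only an illustration, not the code from the pull request: `pick_tier`, `usable_bytes`, the tier list layout, and the example directories are all hypothetical names chosen for this sketch.

```python
import shutil
from typing import Callable, Sequence, Tuple


def usable_bytes(path: str) -> int:
    """Free space, in bytes, on the filesystem that holds `path`."""
    return shutil.disk_usage(path).free


def pick_tier(
    tiers: Sequence[Tuple[str, int]],
    free_space: Callable[[str], int] = usable_bytes,
) -> str:
    """Return the directory of the first tier that still has enough room.

    `tiers` is ordered fastest to slowest; each entry is
    (directory, threshold in bytes). These names are hypothetical.
    The last tier is the backing store, so it is used unconditionally
    when every faster tier is below its threshold.
    """
    if not tiers:
        raise ValueError("at least one tier is required")
    for directory, threshold in tiers[:-1]:
        # Use this tier only while its usable space is at or above the threshold.
        if free_space(directory) >= threshold:
            return directory
    return tiers[-1][0]
```

With a list such as `[("/mnt/ssd", 10 * 2**30), ("/mnt/hdd", 0)]`, blocks would go to the SSD directory until less than 10 GiB remains there, and to the HDD directory afterwards. Adding a third entry in front (for example an NVM directory) gives the deeper hierarchy mentioned in the description.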



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
