[ 
https://issues.apache.org/jira/browse/SPARK-1767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13993909#comment-13993909
 ] 

Sandy Ryza commented on SPARK-1767:
-----------------------------------

Currently, RDDs support only a single level of location preference through 
RDD#preferredLocations(split), which returns a sequence of strings.  To prefer 
cached replicas, this needs to be extended in some way.  We could deprecate 
preferredLocations and add a preferredLocations(split, storageType), where 
storageType is one of MEMORY, DISK, and eventually FLASH.  Alternatively, and 
more hackily, we could give the location strings a prefix like "inmem:" that 
specifies the storage type.
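
To make the two options concrete, here is a rough Python sketch (not Spark 
code) of how prefixed location strings could be parsed and then filtered by 
storage type.  Only the "inmem:" prefix comes from the comment above; the 
"disk" and "flash" prefixes, the StorageType and Location types, and both 
function names are hypothetical, as is the assumption that an unprefixed 
string means an ordinary on-disk replica.

```python
from enum import Enum
from typing import List, NamedTuple


class StorageType(Enum):
    # Hypothetical storage levels; only the "inmem" prefix appears in the proposal.
    MEMORY = "inmem"
    DISK = "disk"
    FLASH = "flash"


class Location(NamedTuple):
    host: str
    storage_type: StorageType


def parse_location(s: str) -> Location:
    """Parse a location string such as "inmem:host1".

    Strings without a recognized prefix (including "host:port" forms) are
    assumed to describe ordinary on-disk replicas and are returned unchanged.
    """
    prefix, sep, host = s.partition(":")
    if sep:
        for storage_type in StorageType:
            if storage_type.value == prefix:
                return Location(host, storage_type)
    return Location(s, StorageType.DISK)


def preferred_locations(locations: List[str], storage_type: StorageType) -> List[str]:
    """Emulate preferredLocations(split, storageType) on top of prefixed strings."""
    parsed = (parse_location(s) for s in locations)
    return [loc.host for loc in parsed if loc.storage_type is storage_type]


if __name__ == "__main__":
    locs = ["inmem:host1", "host2", "inmem:host3"]
    print(preferred_locations(locs, StorageType.MEMORY))
    print(preferred_locations(locs, StorageType.DISK))
```

The sketch also shows why the prefix approach is the hackier one: every 
consumer of the strings has to know the prefix convention, whereas an 
explicit storageType parameter keeps that information in the method 
signature.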

> Prefer HDFS-cached replicas when scheduling data-local tasks
> ------------------------------------------------------------
>
>                 Key: SPARK-1767
>                 URL: https://issues.apache.org/jira/browse/SPARK-1767
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 1.0.0
>            Reporter: Sandy Ryza
>




--
This message was sent by Atlassian JIRA
(v6.2#6252)
