[ https://issues.apache.org/jira/browse/SPARK-15352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16035061#comment-16035061 ]
Dongjoon Hyun commented on SPARK-15352: --------------------------------------- Thank you, [~shubhamc]! > Topology aware block replication > -------------------------------- > > Key: SPARK-15352 > URL: https://issues.apache.org/jira/browse/SPARK-15352 > Project: Spark > Issue Type: New Feature > Components: Block Manager, Mesos, Spark Core, YARN > Reporter: Shubham Chopra > Assignee: Shubham Chopra > Fix For: 2.2.0 > > > With cached RDDs, Spark can be used for online analytics where it is used to > respond to online queries. But loss of RDD partitions due to node/executor > failures can cause huge delays in such use cases as the data would have to be > regenerated. > Cached RDDs, even when using multiple replicas per block, are not currently > resilient to node failures when multiple executors are started on the same > node. Block replication currently chooses a peer at random, and this peer > could also exist on the same host. > This effort would add topology aware replication to Spark that can be enabled > with pluggable strategies. For ease of development/review, this is being > broken down to three major work-efforts: > 1. Making peer selection for replication pluggable > 2. Providing pluggable implementations for providing topology and topology > aware replication > 3. Pro-active replenishment of lost blocks -- This message was sent by Atlassian JIRA (v6.3.15#6346) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org