[GitHub] spark pull request #21758: [SPARK-24795][CORE] Implement barrier execution m...

cloud-fan Fri, 20 Jul 2018 07:38:59 -0700

Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21758#discussion_r204063051
  
    --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala ---
    @@ -1386,29 +1418,90 @@ class DAGScheduler(
                   )
                 }
               }
    -          // Mark the map whose fetch failed as broken in the map stage
    -          if (mapId != -1) {
    -            mapOutputTracker.unregisterMapOutput(shuffleId, mapId, bmAddress)
    -          }
    +        }
     
    -          // TODO: mark the executor as failed only if there were lots of fetch failures on it
    -          if (bmAddress != null) {
    -            val hostToUnregisterOutputs = if (env.blockManager.externalShuffleServiceEnabled &&
    -              unRegisterOutputOnHostOnFetchFailure) {
    -              // We had a fetch failure with the external shuffle service, so we
    -              // assume all shuffle data on the node is bad.
    -              Some(bmAddress.host)
    -            } else {
    -              // Unregister shuffle data just for one executor (we don't have any
    -              // reason to believe shuffle data has been lost for the entire host).
    -              None
    +      case failure: TaskFailedReason if task.isBarrier =>
    +        // Also handle the task failed reasons here.
    +        failure match {
    +          case Resubmitted =>
    +            logInfo("Resubmitted " + task + ", so marking it as still running")
    --- End diff --
    
    This code is used twice; shall we create a `def handleResubmittedTask`?
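
A minimal sketch of what such a helper might look like, so that both `Resubmitted` branches call one method. `Stage`, `ShuffleMapStage`, `ResultStage` and `Task` below are simplified, hypothetical stand-ins, not Spark's real classes. The quoted diff shows only the log line of the branch, so the pending-partition bookkeeping and the error case are assumptions about what the duplicated code does.

```scala
import scala.collection.mutable

// Hypothetical stand-ins for the scheduler's internal types (not Spark's real classes).
sealed trait Stage
final class ShuffleMapStage extends Stage {
  // Partitions whose tasks are still considered running (assumed bookkeeping).
  val pendingPartitions: mutable.Set[Int] = mutable.HashSet.empty[Int]
}
final class ResultStage extends Stage

final case class Task(partitionId: Int, isBarrier: Boolean)

object ResubmitSketch {
  // Single home for the logic that would otherwise be repeated in two branches.
  def handleResubmittedTask(task: Task, stage: Stage): Unit = {
    // println stands in for logInfo so the sketch has no logging dependency.
    println("Resubmitted " + task + ", so marking it as still running")
    stage match {
      case sms: ShuffleMapStage =>
        // Assumption: a resubmitted task goes back into the pending set.
        sms.pendingPartitions += task.partitionId
      case _ =>
        // Assumption: resubmission is only meaningful for shuffle map stages.
        throw new IllegalStateException(
          "Resubmitted is only expected for tasks in shuffle map stages")
    }
  }
}
```

With a helper of this shape, both the barrier and the non-barrier `case Resubmitted =>` branches would reduce to a single call such as `handleResubmittedTask(task, stage)`.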


