subject:"\[GitHub\] \[spark\] Ngone51 commented on a change in pull request #26980\: \[SPARK\-27348\]\[Core\] HeartbeatReceiver should remove lost executors from CoarseGrainedSchedulerBackend"

[GitHub] [spark] Ngone51 commented on a change in pull request #26980: [SPARK-27348][Core] HeartbeatReceiver should remove lost executors from CoarseGrainedSchedulerBackend

2019-12-25 Thread GitBox

Ngone51 commented on a change in pull request #26980: [SPARK-27348][Core] 
HeartbeatReceiver should remove lost executors from 
CoarseGrainedSchedulerBackend
URL: https://github.com/apache/spark/pull/26980#discussion_r361350788
 
 

 ##
 File path: core/src/main/scala/org/apache/spark/HeartbeatReceiver.scala
 ##
 @@ -199,14 +201,25 @@ private[spark] class HeartbeatReceiver(sc: SparkContext, 
clock: Clock)
   if (now - lastSeenMs > executorTimeoutMs) {
 logWarning(s"Removing executor $executorId with no recent heartbeats: 
" +
   s"${now - lastSeenMs} ms exceeds timeout $executorTimeoutMs ms")
-scheduler.executorLost(executorId, SlaveLost("Executor heartbeat " +
 
 Review comment:
   > Since we assume killing executors may take a while, looks reasonable to 
call scheduler.executorLost and re-submit tasks earlier.
   
   I noticed a wired problem of this. That is when we call 
`scheduler.executorLost` in `HeartbeatReceiver`, `scheduler` knows which 
executor lost, but `CoarseGrainedSchedulerBackend` doesn't. So, even if we want 
to re-submit tasks earlier, tasks can still be launched on that lost executor.
   
   But for other two places where we call `scheduler.executorLost`, it 
guarantees that `CoarseGrainedSchedulerBackend` has realized about the 
lost/killed executor.
   
   So, I think that `HeartbeatReceiver` may has used it incorrectly, here.
   
   
   
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] Ngone51 commented on a change in pull request #26980: [SPARK-27348][Core] HeartbeatReceiver should remove lost executors from CoarseGrainedSchedulerBackend

2019-12-25 Thread GitBox

Ngone51 commented on a change in pull request #26980: [SPARK-27348][Core] 
HeartbeatReceiver should remove lost executors from 
CoarseGrainedSchedulerBackend
URL: https://github.com/apache/spark/pull/26980#discussion_r361348573
 
 

 ##
 File path: core/src/main/scala/org/apache/spark/HeartbeatReceiver.scala
 ##
 @@ -199,14 +201,25 @@ private[spark] class HeartbeatReceiver(sc: SparkContext, 
clock: Clock)
   if (now - lastSeenMs > executorTimeoutMs) {
 logWarning(s"Removing executor $executorId with no recent heartbeats: 
" +
   s"${now - lastSeenMs} ms exceeds timeout $executorTimeoutMs ms")
-scheduler.executorLost(executorId, SlaveLost("Executor heartbeat " +
 
 Review comment:
   No serious hurt, but gives waring/error says that the executor has already 
been removed, etc.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] Ngone51 commented on a change in pull request #26980: [SPARK-27348][Core] HeartbeatReceiver should remove lost executors from CoarseGrainedSchedulerBackend

2019-12-24 Thread GitBox

Ngone51 commented on a change in pull request #26980: [SPARK-27348][Core] 
HeartbeatReceiver should remove lost executors from 
CoarseGrainedSchedulerBackend
URL: https://github.com/apache/spark/pull/26980#discussion_r361243811
 
 

 ##
 File path: core/src/test/scala/org/apache/spark/HeartbeatReceiverSuite.scala
 ##
 @@ -153,7 +154,6 @@ class HeartbeatReceiverSuite
 heartbeatReceiverClock.advance(executorTimeout)
 heartbeatReceiverRef.askSync[Boolean](ExpireDeadHosts)
 // Only the second executor should be expired as a dead host
-verify(scheduler).executorLost(meq(executorId2), any())
 
 Review comment:
   Remove this because we no longer invoke `scheduler.executorLost` in 
`HeartbeatReceiver`'s main thread. And asserts below have already proved that 
`executorId2` has been expired. So, I think it should be find to remove.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] Ngone51 commented on a change in pull request #26980: [SPARK-27348][Core] HeartbeatReceiver should remove lost executors from CoarseGrainedSchedulerBackend

2019-12-24 Thread GitBox

Ngone51 commented on a change in pull request #26980: [SPARK-27348][Core] 
HeartbeatReceiver should remove lost executors from 
CoarseGrainedSchedulerBackend
URL: https://github.com/apache/spark/pull/26980#discussion_r361243807
 
 

 ##
 File path: core/src/test/scala/org/apache/spark/HeartbeatReceiverSuite.scala
 ##
 @@ -153,7 +154,6 @@ class HeartbeatReceiverSuite
 heartbeatReceiverClock.advance(executorTimeout)
 heartbeatReceiverRef.askSync[Boolean](ExpireDeadHosts)
 // Only the second executor should be expired as a dead host
-verify(scheduler).executorLost(meq(executorId2), any())
 
 Review comment:
   Remove this because we no longer invoke `scheduler.executorLost` in 
`HeartbeatReceiver`'s main thread. And asserts below have already proved that 
`executorId2` has been expired. So, I think it should be find to remove.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] Ngone51 commented on a change in pull request #26980: [SPARK-27348][Core] HeartbeatReceiver should remove lost executors from CoarseGrainedSchedulerBackend

2019-12-24 Thread GitBox

Ngone51 commented on a change in pull request #26980: [SPARK-27348][Core] 
HeartbeatReceiver should remove lost executors from 
CoarseGrainedSchedulerBackend
URL: https://github.com/apache/spark/pull/26980#discussion_r361243811
 
 

 ##
 File path: core/src/test/scala/org/apache/spark/HeartbeatReceiverSuite.scala
 ##
 @@ -153,7 +154,6 @@ class HeartbeatReceiverSuite
 heartbeatReceiverClock.advance(executorTimeout)
 heartbeatReceiverRef.askSync[Boolean](ExpireDeadHosts)
 // Only the second executor should be expired as a dead host
-verify(scheduler).executorLost(meq(executorId2), any())
 
 Review comment:
   Remove this because we no longer invoke `scheduler.executorLost` in 
`HeartbeatReceiver`'s main thread. And asserts below have already proved that 
`executorId2` has been expired. So, I think it should be fine to remove.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] Ngone51 commented on a change in pull request #26980: [SPARK-27348][Core] HeartbeatReceiver should remove lost executors from CoarseGrainedSchedulerBackend

2019-12-23 Thread GitBox

Ngone51 commented on a change in pull request #26980: [SPARK-27348][Core] 
HeartbeatReceiver should remove lost executors from 
CoarseGrainedSchedulerBackend
URL: https://github.com/apache/spark/pull/26980#discussion_r360858546
 
 

 ##
 File path: core/src/main/scala/org/apache/spark/HeartbeatReceiver.scala
 ##
 @@ -199,14 +201,20 @@ private[spark] class HeartbeatReceiver(sc: SparkContext, 
clock: Clock)
   if (now - lastSeenMs > executorTimeoutMs) {
 logWarning(s"Removing executor $executorId with no recent heartbeats: 
" +
   s"${now - lastSeenMs} ms exceeds timeout $executorTimeoutMs ms")
-scheduler.executorLost(executorId, SlaveLost("Executor heartbeat " +
-  s"timed out after ${now - lastSeenMs} ms"))
 
 Review comment:
   I think this is needed for non-`TaskSchedulerImpl` ?
   
   This can be a behavior change for other TaskScheduler, I'll revert this 
later.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] Ngone51 commented on a change in pull request #26980: [SPARK-27348][Core] HeartbeatReceiver should remove lost executors from CoarseGrainedSchedulerBackend

2019-12-23 Thread GitBox

Ngone51 commented on a change in pull request #26980: [SPARK-27348][Core] 
HeartbeatReceiver should remove lost executors from 
CoarseGrainedSchedulerBackend
URL: https://github.com/apache/spark/pull/26980#discussion_r360819689
 
 

 ##
 File path: core/src/main/scala/org/apache/spark/HeartbeatReceiver.scala
 ##
 @@ -199,14 +201,20 @@ private[spark] class HeartbeatReceiver(sc: SparkContext, 
clock: Clock)
   if (now - lastSeenMs > executorTimeoutMs) {
 logWarning(s"Removing executor $executorId with no recent heartbeats: 
" +
   s"${now - lastSeenMs} ms exceeds timeout $executorTimeoutMs ms")
-scheduler.executorLost(executorId, SlaveLost("Executor heartbeat " +
-  s"timed out after ${now - lastSeenMs} ms"))
-  // Asynchronously kill the executor to avoid blocking the current 
thread
+// Asynchronously kill the executor to avoid blocking the current 
thread
 killExecutorThread.submit(new Runnable {
   override def run(): Unit = Utils.tryLogNonFatalError {
 // Note: we want to get an executor back after expiring this one,
 // so do not simply call `sc.killExecutor` here (SPARK-8119)
 sc.killAndReplaceExecutor(executorId)
+// In case of the executors which are not gracefully shut down, we 
should remove
 
 Review comment:
   Yeah, only.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] Ngone51 commented on a change in pull request #26980: [SPARK-27348][Core] HeartbeatReceiver should remove lost executors from CoarseGrainedSchedulerBackend

2019-12-23 Thread GitBox

Ngone51 commented on a change in pull request #26980: [SPARK-27348][Core] 
HeartbeatReceiver should remove lost executors from 
CoarseGrainedSchedulerBackend
URL: https://github.com/apache/spark/pull/26980#discussion_r360819557
 
 

 ##
 File path: core/src/main/scala/org/apache/spark/HeartbeatReceiver.scala
 ##
 @@ -199,14 +201,20 @@ private[spark] class HeartbeatReceiver(sc: SparkContext, 
clock: Clock)
   if (now - lastSeenMs > executorTimeoutMs) {
 logWarning(s"Removing executor $executorId with no recent heartbeats: 
" +
   s"${now - lastSeenMs} ms exceeds timeout $executorTimeoutMs ms")
-scheduler.executorLost(executorId, SlaveLost("Executor heartbeat " +
-  s"timed out after ${now - lastSeenMs} ms"))
 
 Review comment:
   If we fail to remove it from `CoarseGrainedSchedulerBackend`, then, it means 
the executor has been removed previously(e.g. `onDisconnected()`). And 
`scheduler.executorLost()` must has been called, too.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] Ngone51 commented on a change in pull request #26980: [SPARK-27348][Core] HeartbeatReceiver should remove lost executors from CoarseGrainedSchedulerBackend

2019-12-23 Thread GitBox

Ngone51 commented on a change in pull request #26980: [SPARK-27348][Core] 
HeartbeatReceiver should remove lost executors from 
CoarseGrainedSchedulerBackend
URL: https://github.com/apache/spark/pull/26980#discussion_r360809654
 
 

 ##
 File path: core/src/main/scala/org/apache/spark/HeartbeatReceiver.scala
 ##
 @@ -199,14 +201,20 @@ private[spark] class HeartbeatReceiver(sc: SparkContext, 
clock: Clock)
   if (now - lastSeenMs > executorTimeoutMs) {
 logWarning(s"Removing executor $executorId with no recent heartbeats: 
" +
   s"${now - lastSeenMs} ms exceeds timeout $executorTimeoutMs ms")
-scheduler.executorLost(executorId, SlaveLost("Executor heartbeat " +
-  s"timed out after ${now - lastSeenMs} ms"))
-  // Asynchronously kill the executor to avoid blocking the current 
thread
+// Asynchronously kill the executor to avoid blocking the current 
thread
 killExecutorThread.submit(new Runnable {
   override def run(): Unit = Utils.tryLogNonFatalError {
 // Note: we want to get an executor back after expiring this one,
 // so do not simply call `sc.killExecutor` here (SPARK-8119)
 sc.killAndReplaceExecutor(executorId)
+// In case of the executors which are not gracefully shut down, we 
should remove
 
 Review comment:
   Yeah, in this case, `CoarseGrainedSchedulerBackend` can't handle 
non-graceful executor shutdown well. But I am not sure what do you mean by 
"only", here? What other factors do you also care about?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] Ngone51 commented on a change in pull request #26980: [SPARK-27348][Core] HeartbeatReceiver should remove lost executors from CoarseGrainedSchedulerBackend

2019-12-22 Thread GitBox

Ngone51 commented on a change in pull request #26980: [SPARK-27348][Core] 
HeartbeatReceiver should remove lost executors from 
CoarseGrainedSchedulerBackend
URL: https://github.com/apache/spark/pull/26980#discussion_r360796114
 
 

 ##
 File path: core/src/main/scala/org/apache/spark/HeartbeatReceiver.scala
 ##
 @@ -199,14 +201,20 @@ private[spark] class HeartbeatReceiver(sc: SparkContext, 
clock: Clock)
   if (now - lastSeenMs > executorTimeoutMs) {
 logWarning(s"Removing executor $executorId with no recent heartbeats: 
" +
   s"${now - lastSeenMs} ms exceeds timeout $executorTimeoutMs ms")
-scheduler.executorLost(executorId, SlaveLost("Executor heartbeat " +
-  s"timed out after ${now - lastSeenMs} ms"))
-  // Asynchronously kill the executor to avoid blocking the current 
thread
+// Asynchronously kill the executor to avoid blocking the current 
thread
 killExecutorThread.submit(new Runnable {
   override def run(): Unit = Utils.tryLogNonFatalError {
 // Note: we want to get an executor back after expiring this one,
 // so do not simply call `sc.killExecutor` here (SPARK-8119)
 sc.killAndReplaceExecutor(executorId)
+// In case of the executors which are not gracefully shut down, we 
should remove
 
 Review comment:
   No, no.
   
   Please pay attention to the comment above and note that, here, we're not 
saying that this issue happens due to `CoarseGrainedSchedulerBackend` may shut 
down executors non-gracefully. But we say that it may happens due to the 
executor itself is not gracefully shut down.
   
   It's possible that `CoarseGrainedSchedulerBackend` asks someone executor to 
shutdown and the executor may shutdown  non-gracefully. And it's also possible 
that the host where the executor located suddenly crushed that makes the 
executor shutdown non-gracefully. 
   
   Shutdown non-gracefully here only means that an executor shutdown but 
`CoarseGrainedSchedulerBackend` doesn't receive disconnect event from it. It 
really doesn't matter what causes the executor shutdown non-gracefully.
   
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] Ngone51 commented on a change in pull request #26980: [SPARK-27348][Core] HeartbeatReceiver should remove lost executors from CoarseGrainedSchedulerBackend

2019-12-22 Thread GitBox

Ngone51 commented on a change in pull request #26980: [SPARK-27348][Core] 
HeartbeatReceiver should remove lost executors from 
CoarseGrainedSchedulerBackend
URL: https://github.com/apache/spark/pull/26980#discussion_r360790014
 
 

 ##
 File path: core/src/main/scala/org/apache/spark/HeartbeatReceiver.scala
 ##
 @@ -199,14 +201,20 @@ private[spark] class HeartbeatReceiver(sc: SparkContext, 
clock: Clock)
   if (now - lastSeenMs > executorTimeoutMs) {
 logWarning(s"Removing executor $executorId with no recent heartbeats: 
" +
   s"${now - lastSeenMs} ms exceeds timeout $executorTimeoutMs ms")
-scheduler.executorLost(executorId, SlaveLost("Executor heartbeat " +
-  s"timed out after ${now - lastSeenMs} ms"))
-  // Asynchronously kill the executor to avoid blocking the current 
thread
+// Asynchronously kill the executor to avoid blocking the current 
thread
 killExecutorThread.submit(new Runnable {
   override def run(): Unit = Utils.tryLogNonFatalError {
 // Note: we want to get an executor back after expiring this one,
 // so do not simply call `sc.killExecutor` here (SPARK-8119)
 sc.killAndReplaceExecutor(executorId)
+// In case of the executors which are not gracefully shut down, we 
should remove
 
 Review comment:
   Match `CoarseGrainedSchedulerBackend` covers all scheduler 
backend(Standalone, YARN, K8S, Mesos) that officially supported by Spark and we 
can guarantee that this issue can be resolved internally. But for other plugged 
scheduler backend, Spark seems has nothing to do with current 
`SchedulerBackend` interface at least for this issue.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] Ngone51 commented on a change in pull request #26980: [SPARK-27348][Core] HeartbeatReceiver should remove lost executors from CoarseGrainedSchedulerBackend

2019-12-22 Thread GitBox

Ngone51 commented on a change in pull request #26980: [SPARK-27348][Core] 
HeartbeatReceiver should remove lost executors from 
CoarseGrainedSchedulerBackend
URL: https://github.com/apache/spark/pull/26980#discussion_r360782760
 
 

 ##
 File path: core/src/main/scala/org/apache/spark/HeartbeatReceiver.scala
 ##
 @@ -199,14 +201,20 @@ private[spark] class HeartbeatReceiver(sc: SparkContext, 
clock: Clock)
   if (now - lastSeenMs > executorTimeoutMs) {
 logWarning(s"Removing executor $executorId with no recent heartbeats: 
" +
   s"${now - lastSeenMs} ms exceeds timeout $executorTimeoutMs ms")
-scheduler.executorLost(executorId, SlaveLost("Executor heartbeat " +
-  s"timed out after ${now - lastSeenMs} ms"))
 
 Review comment:
   Hmm...This may can be a problem...


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] Ngone51 commented on a change in pull request #26980: [SPARK-27348][Core] HeartbeatReceiver should remove lost executors from CoarseGrainedSchedulerBackend

2019-12-22 Thread GitBox

Ngone51 commented on a change in pull request #26980: [SPARK-27348][Core] 
HeartbeatReceiver should remove lost executors from 
CoarseGrainedSchedulerBackend
URL: https://github.com/apache/spark/pull/26980#discussion_r360782486
 
 

 ##
 File path: core/src/main/scala/org/apache/spark/HeartbeatReceiver.scala
 ##
 @@ -199,14 +201,20 @@ private[spark] class HeartbeatReceiver(sc: SparkContext, 
clock: Clock)
   if (now - lastSeenMs > executorTimeoutMs) {
 logWarning(s"Removing executor $executorId with no recent heartbeats: 
" +
   s"${now - lastSeenMs} ms exceeds timeout $executorTimeoutMs ms")
-scheduler.executorLost(executorId, SlaveLost("Executor heartbeat " +
-  s"timed out after ${now - lastSeenMs} ms"))
-  // Asynchronously kill the executor to avoid blocking the current 
thread
+// Asynchronously kill the executor to avoid blocking the current 
thread
 killExecutorThread.submit(new Runnable {
   override def run(): Unit = Utils.tryLogNonFatalError {
 // Note: we want to get an executor back after expiring this one,
 // so do not simply call `sc.killExecutor` here (SPARK-8119)
 sc.killAndReplaceExecutor(executorId)
+// In case of the executors which are not gracefully shut down, we 
should remove
 
 Review comment:
   > Do you mean CoarseGrainedSchedulerBackend is the only one that may shut 
down executors non-gracefully? 
   
   No. IIUC, it's executor's own behavior but not related to scheduler backend. 
An executor can be self-lost(no heartbeat) but still alive, e.g. due to network 
problem.
   
   > And if it shuts down gracefully, is it OK to have 2 RemoveExecutor events?
   
   Yes, we have protection for this case.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] Ngone51 commented on a change in pull request #26980: [SPARK-27348][Core] HeartbeatReceiver should remove lost executors from CoarseGrainedSchedulerBackend

2019-12-22 Thread GitBox

Ngone51 commented on a change in pull request #26980: [SPARK-27348][Core] 
HeartbeatReceiver should remove lost executors from 
CoarseGrainedSchedulerBackend
URL: https://github.com/apache/spark/pull/26980#discussion_r360714352
 
 

 ##
 File path: core/src/main/scala/org/apache/spark/HeartbeatReceiver.scala
 ##
 @@ -199,14 +201,20 @@ private[spark] class HeartbeatReceiver(sc: SparkContext, 
clock: Clock)
   if (now - lastSeenMs > executorTimeoutMs) {
 logWarning(s"Removing executor $executorId with no recent heartbeats: 
" +
   s"${now - lastSeenMs} ms exceeds timeout $executorTimeoutMs ms")
-scheduler.executorLost(executorId, SlaveLost("Executor heartbeat " +
-  s"timed out after ${now - lastSeenMs} ms"))
 
 Review comment:
   Since remove executor in `CoarseGrainedSchedulerBackend` will always do 
`scheduler.executorLost()`, so it's no longer needed here. 


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] Ngone51 commented on a change in pull request #26980: [SPARK-27348][Core] HeartbeatReceiver should remove lost executors from CoarseGrainedSchedulerBackend

[GitHub] [spark] Ngone51 commented on a change in pull request #26980: [SPARK-27348][Core] HeartbeatReceiver should remove lost executors from CoarseGrainedSchedulerBackend

[GitHub] [spark] Ngone51 commented on a change in pull request #26980: [SPARK-27348][Core] HeartbeatReceiver should remove lost executors from CoarseGrainedSchedulerBackend

[GitHub] [spark] Ngone51 commented on a change in pull request #26980: [SPARK-27348][Core] HeartbeatReceiver should remove lost executors from CoarseGrainedSchedulerBackend

[GitHub] [spark] Ngone51 commented on a change in pull request #26980: [SPARK-27348][Core] HeartbeatReceiver should remove lost executors from CoarseGrainedSchedulerBackend

[GitHub] [spark] Ngone51 commented on a change in pull request #26980: [SPARK-27348][Core] HeartbeatReceiver should remove lost executors from CoarseGrainedSchedulerBackend

[GitHub] [spark] Ngone51 commented on a change in pull request #26980: [SPARK-27348][Core] HeartbeatReceiver should remove lost executors from CoarseGrainedSchedulerBackend

[GitHub] [spark] Ngone51 commented on a change in pull request #26980: [SPARK-27348][Core] HeartbeatReceiver should remove lost executors from CoarseGrainedSchedulerBackend

[GitHub] [spark] Ngone51 commented on a change in pull request #26980: [SPARK-27348][Core] HeartbeatReceiver should remove lost executors from CoarseGrainedSchedulerBackend

[GitHub] [spark] Ngone51 commented on a change in pull request #26980: [SPARK-27348][Core] HeartbeatReceiver should remove lost executors from CoarseGrainedSchedulerBackend

[GitHub] [spark] Ngone51 commented on a change in pull request #26980: [SPARK-27348][Core] HeartbeatReceiver should remove lost executors from CoarseGrainedSchedulerBackend

[GitHub] [spark] Ngone51 commented on a change in pull request #26980: [SPARK-27348][Core] HeartbeatReceiver should remove lost executors from CoarseGrainedSchedulerBackend

[GitHub] [spark] Ngone51 commented on a change in pull request #26980: [SPARK-27348][Core] HeartbeatReceiver should remove lost executors from CoarseGrainedSchedulerBackend

[GitHub] [spark] Ngone51 commented on a change in pull request #26980: [SPARK-27348][Core] HeartbeatReceiver should remove lost executors from CoarseGrainedSchedulerBackend

14 matches

Site Navigation

Mail list logo

Footer information