[GitHub] [spark] attilapiros commented on a change in pull request #29211: [SPARK-31197][CORE] Shutdown executor once we are done decommissioning

GitBox Fri, 31 Jul 2020 11:24:13 -0700


attilapiros commented on a change in pull request #29211:
URL: https://github.com/apache/spark/pull/29211#discussion_r463763660




##########
File path: 
core/src/main/scala/org/apache/spark/storage/BlockManagerDecommissioner.scala
##########
@@ -327,4 +354,28 @@ private[storage] class BlockManagerDecommissioner(
     }
     logInfo("Stopped storage decommissioner")
   }
+
+  /*
+   *  Returns the last migration time and a boolean for if all blocks have 
been migrated.
+   *  If there are any tasks running since that time the boolean may be 
incorrect.
+   */
+  private[storage] def lastMigrationInfo(): (Long, Boolean) = {
+    if (stopped || (stoppedRDD && stoppedShuffle)) {
+      (System.nanoTime(), true)
+    } else {
+      // Chose the min of the running times.
+      val lastMigrationTime = if (
+        conf.get(config.STORAGE_DECOMMISSION_SHUFFLE_BLOCKS_ENABLED) &&
+        conf.get(config.STORAGE_DECOMMISSION_RDD_BLOCKS_ENABLED)) {
+        Math.min(lastRDDMigrationTime, lastShuffleMigrationTime)

Review comment:
       Yes, I know this. That is what I meant by
   
   >  (Theoretically one task could be started as the scheduler does not 
processed the DecommissionExecutor message but let's take this corner case out 
as we can have this handled with some sleep).
   
   So I think with some waiting (which I would avoid if I could) the 
probability of this unlikely case can be decreased more.
   
   But let's assume even that very unlikely case happens then we are losing one 
cached RDD from the current stage which will be re-submitted. The advantage on 
the other hand a more simple code which can be fixed/improved by more 
developers.
   
   **Other idea**: leave this logic as it is but refactor the `shutdownThread` 
into a separate class to make it unittestable (all its dependencies would be 
mockable, even the time its uses would be Clock a ManualClock in test).
   
   What do you think?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] attilapiros commented on a change in pull request #29211: [SPARK-31197][CORE] Shutdown executor once we are done decommissioning

Reply via email to