sunchao commented on code in PR #45052:
URL: https://github.com/apache/spark/pull/45052#discussion_r1501618843


##########
core/src/main/scala/org/apache/spark/storage/BlockManager.scala:
##########
@@ -177,15 +177,17 @@ private[spark] class HostLocalDirManager(
  * Manager running on every node (driver and executors) which provides 
interfaces for putting and
  * retrieving blocks both locally and remotely into various stores (memory, 
disk, and off-heap).
  *
- * Note that [[initialize()]] must be called before the BlockManager is usable.
+ * Note that [[initialize()]] must be called before the BlockManager is 
usable. Also, the
+ * `memoryManager` is initialized at a later stage after DriverPlugin is 
loaded, to allow the
+ * plugin to overwrite memory configurations.
  */
 private[spark] class BlockManager(
     val executorId: String,
     rpcEnv: RpcEnv,
     val master: BlockManagerMaster,
     val serializerManager: SerializerManager,
     val conf: SparkConf,
-    memoryManager: MemoryManager,
+    var memoryManager: MemoryManager,

Review Comment:
   I think it may also happen in the non-test path but could be very rare and 
the case only got amplified with this PR. I described the scenario above:
   
   > I'm not sure whether this is a valid case. Looking at the [Task 
class](https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/Task.scala),
 it calls SparkEnv.get in several places. However, in the rare cases when a 
SparkContext has been shut down and a new SparkContext is created, running 
tasks launched by the former could access SparkEnv created by the latter, which 
seems not valid. I think this can only happen in the local mode.
   
   The [latest 
changes](https://github.com/apache/spark/pull/45052/files/a6502e06ef2bf68654f1d3c8bb5c483a071d8770..af68ece323401d06f7e036c3b0782bc80a0f4c93)
 updated `Task` to use the `BlockManager` from the `Executor` that launches the 
task, instead of the one from `SparkEnv.get` which could be from a different 
`SparkContext`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

Reply via email to