Re: [PR] [SPARK-46947][CORE] Delay memory manager initialization until Driver plugin is loaded [spark]

via GitHub Thu, 22 Feb 2024 16:47:24 -0800


sunchao commented on code in PR #45052:
URL: https://github.com/apache/spark/pull/45052#discussion_r1500109761



##########
core/src/main/scala/org/apache/spark/storage/BlockManager.scala:
##########
@@ -177,15 +177,17 @@ private[spark] class HostLocalDirManager(
  * Manager running on every node (driver and executors) which provides interfaces for putting and
  * retrieving blocks both locally and remotely into various stores (memory, disk, and off-heap).
  *
- * Note that [[initialize()]] must be called before the BlockManager is usable.
+ * Note that [[initialize()]] must be called before the BlockManager is usable. Also, the
+ * `memoryManager` is initialized at a later stage after DriverPlugin is loaded, to allow the
+ * plugin to overwrite memory configurations.
  */
 private[spark] class BlockManager(
     val executorId: String,
     rpcEnv: RpcEnv,
     val master: BlockManagerMaster,
     val serializerManager: SerializerManager,
     val conf: SparkConf,
-    memoryManager: MemoryManager,
+    var memoryManager: MemoryManager,

Review Comment:
   Sure. The failing test is "job group with interruption". It looks flaky, though: the failure does not happen on every run, and when I tried it locally it did not always reproduce. The job link:
   https://github.com/sunchao/spark/actions/runs/7923522243/job/21637267400
   
   ```
   [info] - job group with interruption *** FAILED *** (34 milliseconds)
   [info]   java.lang.NullPointerException: Cannot invoke "org.apache.spark.memory.MemoryManager.maxOnHeapStorageMemory()" because the return value of "org.apache.spark.storage.BlockManager.memoryManager()" is null
   [info]   at org.apache.spark.storage.BlockManager.maxOnHeapMemory$lzycompute(BlockManager.scala:243)
   [info]   at org.apache.spark.storage.BlockManager.maxOnHeapMemory(BlockManager.scala:243)
   [info]   at org.apache.spark.storage.BlockManager.initialize(BlockManager.scala:565)
   [info]   at org.apache.spark.SparkContext.<init>(SparkContext.scala:633)
   [info]   at org.apache.spark.SparkContext.<init>(SparkContext.scala:159)
   [info]   at org.apache.spark.SparkContext.<init>(SparkContext.scala:172)
   [info]   at org.apache.spark.JobCancellationSuite.$anonfun$new$41(JobCancellationSuite.scala:397)
   [info]   at org.scalatest.enablers.Timed$$anon$1.timeoutAfter(Timed.scala:127)
   [info]   at org.scalatest.concurrent.TimeLimits$.failAfterImpl(TimeLimits.scala:282)
   [info]   at org.scalatest.concurrent.TimeLimits.failAfter(TimeLimits.scala:231)
   [info]   at org.scalatest.concurrent.TimeLimits.failAfter$(TimeLimits.scala:230)
   [info]   at org.apache.spark.SparkFunSuite.failAfter(SparkFunSuite.scala:69)
   ```
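
One reading of the trace above is that `initialize()` forces the lazy `maxOnHeapMemory` while the `memoryManager` var is still null. The sketch below reproduces only that shape, under stated assumptions: `SketchMemoryManager`, `SketchBlockManager`, and `LateInitSketch` are hypothetical stand-ins, not Spark's classes, and the sketch does not explain why the real failure is intermittent.

```scala
// Hypothetical stand-in for a memory manager; not Spark's MemoryManager.
final class SketchMemoryManager(val maxOnHeapStorageMemory: Long)

// Hypothetical stand-in for a block manager whose memory manager is assigned late.
final class SketchBlockManager(var memoryManager: SketchMemoryManager) {
  // Computed on first access, so it dereferences whatever memoryManager holds then.
  lazy val maxOnHeapMemory: Long = memoryManager.maxOnHeapStorageMemory

  def initialize(): Unit = {
    // Forces the lazy val; throws NullPointerException if memoryManager is still null.
    require(maxOnHeapMemory >= 0L, "max on-heap memory must be non-negative")
  }
}

object LateInitSketch {
  def main(args: Array[String]): Unit = {
    // Case 1: initialize() runs before the memory manager is assigned.
    val early = new SketchBlockManager(null)
    val threwNpe =
      try { early.initialize(); false }
      catch { case _: NullPointerException => true }
    println(s"initialize() before assignment threw NPE: $threwNpe")

    // Case 2: the memory manager is assigned first, then initialize() runs.
    val late = new SketchBlockManager(null)
    late.memoryManager = new SketchMemoryManager(1024L)
    late.initialize()
    println(s"maxOnHeapMemory after assignment: ${late.maxOnHeapMemory}")
  }
}
```

In this sketch a lazy val whose initializer throws is not cached, so a later access after assignment would succeed; the ordering of assignment and `initialize()` is what matters.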



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

