[GitHub] [spark] dongjoon-hyun commented on a diff in pull request #38943: [SPARK-41410][K8S] Support PVC-oriented executor pod allocation

2022-12-06 Thread GitBox


dongjoon-hyun commented on code in PR #38943:
URL: https://github.com/apache/spark/pull/38943#discussion_r1041445654


##
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsAllocator.scala:
##
@@ -47,6 +48,17 @@ class ExecutorPodsAllocator(
 
   private val EXECUTOR_ID_COUNTER = new AtomicInteger(0)
 
+  private val PVC_COUNTER = new AtomicInteger(0)
+
+  private val maxPVCs = if (Utils.isDynamicAllocationEnabled(conf)) {
+    conf.get(DYN_ALLOCATION_MAX_EXECUTORS)
+  } else {
+    conf.getInt(EXECUTOR_INSTANCES.key, DEFAULT_NUMBER_EXECUTORS)
+  }
+
+  private val reusePVC = conf.get(KUBERNETES_DRIVER_OWN_PVC) &&
+    conf.get(KUBERNETES_DRIVER_REUSE_PVC) &&
+    conf.get(KUBERNETES_DRIVER_WAIT_TO_REUSE_PVC)

Review Comment:
   You are right. Let me rename~
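
   As a side note, the cap-and-counter pattern in the diff above can be sketched standalone: an `AtomicInteger` guards an upper bound derived from configuration. This is an illustrative sketch, not Spark's actual class; the `PvcBudget` name and plain constructor parameters (standing in for the `conf` lookups) are assumptions.

   ```scala
   import java.util.concurrent.atomic.AtomicInteger

   // Standalone sketch of the PVC budget: the cap comes from dynamic-allocation
   // max executors when dynamic allocation is on, otherwise the static count.
   final class PvcBudget(dynamicAllocation: Boolean,
                         dynAllocationMaxExecutors: Int,
                         executorInstances: Int) {
     val maxPVCs: Int =
       if (dynamicAllocation) dynAllocationMaxExecutors else executorInstances

     private val pvcCounter = new AtomicInteger(0)

     // Returns true (and counts the PVC) if a new on-demand PVC may be created;
     // rolls the counter back and returns false once the cap is reached.
     def tryAcquire(): Boolean = {
       if (pvcCounter.getAndIncrement() < maxPVCs) true
       else { pvcCounter.decrementAndGet(); false }
     }

     def created: Int = pvcCounter.get()
   }
   ```

   With a static count of 2, the third acquisition would be refused while the counter stays at 2.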



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] dongjoon-hyun commented on a diff in pull request #38943: [SPARK-41410][K8S] Support PVC-oriented executor pod allocation

2022-12-06 Thread GitBox


dongjoon-hyun commented on code in PR #38943:
URL: https://github.com/apache/spark/pull/38943#discussion_r1041376046


##
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsAllocator.scala:
##
@@ -398,6 +410,10 @@ class ExecutorPodsAllocator(
     // Check reusable PVCs for this executor allocation batch
     val reusablePVCs = getReusablePVCs(applicationId, pvcsInUse)
     for ( _ <- 0 until numExecutorsToAllocate) {
+      if (reusablePVCs.isEmpty && reusePVC && maxPVCs <= PVC_COUNTER.get()) {

Review Comment:
   Yes, correct! When we have `reusablePVCs`, `PVC-oriented executor pod allocation` doesn't need to be blocked. We halt `executor allocation` only when there are no available PVCs and `PVC_COUNTER` is greater than or equal to the maximum.
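
   The halting predicate described above can be isolated as a pure function. This is a sketch mirroring the condition in the diff; `shouldHaltAllocation` is a hypothetical helper name, and reusable PVCs are modeled as a plain `Seq[String]` rather than Spark's Kubernetes client types.

   ```scala
   // Allocation is blocked only when all three hold: PVC reuse gating is on,
   // no reusable PVC is available, and the created-PVC count reached the cap.
   def shouldHaltAllocation(reusablePVCs: Seq[String],
                            reusePVC: Boolean,
                            maxPVCs: Int,
                            pvcCount: Int): Boolean =
     reusablePVCs.isEmpty && reusePVC && maxPVCs <= pvcCount
   ```

   Note that a non-empty `reusablePVCs` or a disabled `reusePVC` flag lets allocation proceed even at the cap, which matches the comment above.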






[GitHub] [spark] dongjoon-hyun commented on a diff in pull request #38943: [SPARK-41410][K8S] Support PVC-oriented executor pod allocation

2022-12-06 Thread GitBox


dongjoon-hyun commented on code in PR #38943:
URL: https://github.com/apache/spark/pull/38943#discussion_r1041347366


##
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala:
##
@@ -101,6 +100,17 @@ private[spark] object Config extends Logging {
   .booleanConf
   .createWithDefault(true)
 
+  val KUBERNETES_DRIVER_WAIT_TO_REUSE_PVC =
+    ConfigBuilder("spark.kubernetes.driver.waitToReusePersistentVolumeClaims")
+      .doc("If true, driver pod counts the number of created on-demand persistent volume claims " +
+        s"and wait if the number is greater than or equal to the maximum which is " +
+        s"${EXECUTOR_INSTANCES.key} or ${DYN_ALLOCATION_MAX_EXECUTORS.key}. " +
+        s"This config requires both ${KUBERNETES_DRIVER_OWN_PVC.key}=true and " +
+        s"${KUBERNETES_DRIVER_REUSE_PVC.key}=true.")

Review Comment:
   Yes, initially I tried to use it as a config name, but `PVC-oriented executor pod allocation` is achieved by three configurations:
   - spark.kubernetes.driver.waitToReusePersistentVolumeClaims
   - spark.kubernetes.driver.ownPersistentVolumeClaims
   - spark.kubernetes.driver.reusePersistentVolumeClaims
   
   I'll add a K8s document section with that name.






[GitHub] [spark] dongjoon-hyun commented on a diff in pull request #38943: [SPARK-41410][K8S] Support PVC-oriented executor pod allocation

2022-12-06 Thread GitBox


dongjoon-hyun commented on code in PR #38943:
URL: https://github.com/apache/spark/pull/38943#discussion_r1041345486


##
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsAllocator.scala:
##
@@ -398,6 +410,10 @@ class ExecutorPodsAllocator(
     // Check reusable PVCs for this executor allocation batch
     val reusablePVCs = getReusablePVCs(applicationId, pvcsInUse)
     for ( _ <- 0 until numExecutorsToAllocate) {
+      if (reusablePVCs.isEmpty && reusePVC && maxPVCs <= PVC_COUNTER.get()) {

Review Comment:
   Thank you for review.
   
   Thank you for the review.
   
   Theoretically, `reusablePVCs` are all driver-owned PVCs that are older than `podAllocationDelay`. So, the count can be bigger than `maxPVCs` if there is other PVC creation logic (for example, a Spark driver plugin).
   
   
https://github.com/apache/spark/blob/89b2ee27d258dec8fe265fa862846e800a374d8e/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsAllocator.scala#L364-L382
   
   Also, previously, Spark created new pods and PVCs when some executors died. In that case, a few more PVCs could be created.
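
   The age-based reuse filter described above can be sketched as follows. This is illustrative only: the `PVC` case class and `reusable` function are assumed stand-ins for Spark's Kubernetes client objects and `getReusablePVCs` logic, with a PVC counted as reusable once it has existed longer than `podAllocationDelay`.

   ```scala
   import java.time.Instant

   // Hypothetical minimal model of a driver-owned persistent volume claim.
   case class PVC(name: String, createdAt: Instant)

   // A PVC is considered reusable once its age exceeds podAllocationDelay,
   // i.e. its creation time plus the delay lies in the past.
   def reusable(pvcs: Seq[PVC], podAllocationDelayMs: Long, now: Instant): Seq[PVC] =
     pvcs.filter(p => p.createdAt.plusMillis(podAllocationDelayMs).isBefore(now))
   ```

   A freshly created PVC is excluded because its owning executor pod may still be coming up; only PVCs past the delay are offered for reuse.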


