[ https://issues.apache.org/jira/browse/SPARK-43496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17856508#comment-17856508 ]
James Boylan edited comment on SPARK-43496 at 6/20/24 2:30 PM: --------------------------------------------------------------- I can't emphasize enough how important this feature is and how badly it is needed. Also, we should update the ticket to show that it affects 3.4.0 and 3.5.0. While having a default behavior of deriving the limits and requests from the cores and memory settings makes sense, the current configuration is completely counter to standard Kubernetes practice and makes it difficult to manage Spark processes on a cluster in a cost-effective manner. [~julienlau] *said:*
{quote}new options:
_spark.kubernetes.driver.requests.cpu_
_spark.kubernetes.driver.requests.memory_
_spark.kubernetes.driver.limits.cpu_
_spark.kubernetes.driver.limits.memory_
_spark.kubernetes.executor.requests.cpu_
_spark.kubernetes.executor.requests.memory_
_spark.kubernetes.executor.limits.cpu_
_spark.kubernetes.executor.limits.memory_
If unset, stay consistent with current behavior; if set to 0, disable that definition. This would also solve the issue that driver/executor cores are defined as an integer and cannot be 0.5 for a driver.
{quote}
Honestly, this would be the perfect implementation of the feature and would line up exactly with how applications should support Kubernetes. This is an area where Spark is painfully losing out to applications like Flink. Since Flink does not manage the creation of its Task Managers, it allows administrators to build out the manifest to meet the specific needs of their environment. I understand why Spark manages the executor deployments, and I agree with the reasoning, but the configuration options need to be available to handle all of the settings required for deployment onto Kubernetes. Pod templates already cover almost everything, with the exception of the memory and CPU limits/requests settings.
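To make the proposal concrete, the quoted options could be set in a spark-defaults.conf fragment like the sketch below. Note these keys are only *proposed* in this thread and do not exist in released Spark; the values shown are arbitrary examples. (Today, Spark does expose _spark.kubernetes.driver.request.cores_ / _spark.kubernetes.driver.limit.cores_ and the executor equivalents for CPU, but memory request and limit are both derived from the configured JVM memory plus overhead, with no separate knobs.)

{code}
# Hypothetical keys from the proposal above -- NOT available in released Spark.
spark.kubernetes.driver.requests.cpu=0.5
spark.kubernetes.driver.requests.memory=2g
spark.kubernetes.driver.limits.cpu=1
spark.kubernetes.driver.limits.memory=4g
spark.kubernetes.executor.requests.cpu=2
spark.kubernetes.executor.requests.memory=8g
# Per the proposal, 0 would mean "omit this limit entirely",
# yielding a burstable pod with no memory cap.
spark.kubernetes.executor.limits.cpu=0
spark.kubernetes.executor.limits.memory=12g
{code}

Setting requests below limits this way would give pods the Burstable QoS class in Kubernetes, which is exactly the cost-management pattern the comment describes.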
> Have a separate config for Memory limits for kubernetes pods
> ------------------------------------------------------------
>
>                 Key: SPARK-43496
>                 URL: https://issues.apache.org/jira/browse/SPARK-43496
>             Project: Spark
>          Issue Type: Improvement
>          Components: Kubernetes
>    Affects Versions: 3.4.0
>            Reporter: Alexander Yerenkow
>            Priority: Major
>              Labels: pull-request-available
>
> The whole memory allocated to the JVM is set in the pod resources as both request and limit. This means there is no way to use more memory for burst-like jobs in a shared environment. For example, if a Spark job uses an external process (outside the JVM) to access data, a bit of extra memory is required for that, and being able to configure a higher memory limit could be of use.
> Another thought: a way to configure different JVM and pod memory requests could also be a valid use case.
>
> Github PR: [https://github.com/apache/spark/pull/41067]

-- This message was sent by Atlassian Jira (v8.20.10#820010)
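For context on the pod-template gap raised in the comment above: an administrator can supply a template via _spark.kubernetes.executor.podTemplateFile_, but Spark builds the executor container's memory request and limit from its own memory settings, overriding whatever the template declares. An illustrative sketch (the values are arbitrary, and this is my reading of current behavior, not an official statement):

{code}
# executor-pod-template.yaml -- illustrative sketch
apiVersion: v1
kind: Pod
spec:
  containers:
    - name: spark-kubernetes-executor
      resources:
        requests:
          # Overridden: Spark sets request = limit from
          # spark.executor.memory plus memory overhead.
          memory: 8Gi
        limits:
          # Overridden for the same reason, so request and
          # limit always end up equal for memory.
          memory: 12Gi
{code}

This is why the feature request cannot be worked around with templates alone, unlike most other pod settings.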