[GitHub] spark pull request #22904: [SPARK-25887][K8S] Configurable K8S context suppo...
Github user aditanase commented on a diff in the pull request: https://github.com/apache/spark/pull/22904#discussion_r240141774 --- Diff: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/SparkKubernetesClientFactory.scala --- @@ -67,8 +66,16 @@ private[spark] object SparkKubernetesClientFactory { val dispatcher = new Dispatcher( ThreadUtils.newDaemonCachedThreadPool("kubernetes-dispatcher")) -// TODO [SPARK-25887] Create builder in a way that respects configurable context -val config = new ConfigBuilder() +// Allow for specifying a context used to auto-configure from the users K8S config file +val kubeContext = sparkConf.get(KUBERNETES_CONTEXT).filter(c => StringUtils.isNotBlank(c)) +logInfo(s"Auto-configuring K8S client using " + + s"${if (kubeContext.isEmpty) s"context ${kubeContext.get}" else "current context"}" + + s" from users K8S config file") + +// Start from an auto-configured config with the desired context +// Fabric 8 uses null to indicate that the users current context should be used so if no +// explicit setting pass null +val config = new ConfigBuilder(autoConfigure(kubeContext.getOrElse(null))) --- End diff -- Ok, this is starting to sink in. I just realized that in my deployment scenario, all I need to do is to specify the `serviceAccountName` on _my_ pod template that contains the driver. That leaves me confused though - what other "client mode deployments" do you see happening on K8S? What does client mode mean to you? Also - how should one interpret this paragraph in the docs? ` In client mode, use spark.kubernetes.authenticate.serviceAccountName instead.` --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22904: [SPARK-25887][K8S] Configurable K8S context suppo...
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/22904#discussion_r239903144 --- Diff: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/SparkKubernetesClientFactory.scala --- @@ -67,8 +66,16 @@ private[spark] object SparkKubernetesClientFactory { val dispatcher = new Dispatcher( ThreadUtils.newDaemonCachedThreadPool("kubernetes-dispatcher")) -// TODO [SPARK-25887] Create builder in a way that respects configurable context -val config = new ConfigBuilder() +// Allow for specifying a context used to auto-configure from the users K8S config file +val kubeContext = sparkConf.get(KUBERNETES_CONTEXT).filter(c => StringUtils.isNotBlank(c)) +logInfo(s"Auto-configuring K8S client using " + + s"${if (kubeContext.isEmpty) s"context ${kubeContext.get}" else "current context"}" + + s" from users K8S config file") + +// Start from an auto-configured config with the desired context +// Fabric 8 uses null to indicate that the users current context should be used so if no +// explicit setting pass null +val config = new ConfigBuilder(autoConfigure(kubeContext.getOrElse(null))) --- End diff -- > I think this enhancement does not apply to client mode If you mean "client mode inside a k8s-managed docker container", then yes, you may need to do extra stuff, like mount the appropriate credentials. But in the "client mode with driver inside k8s pod" case, Spark does not create that pod for you. So I'm not sure how Spark can help with anything there; the `serviceName` configuration seems targeted at propagating the credentials of the submitter to the driver pod, and in that case Spark is not creating the driver pod at all. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22904: [SPARK-25887][K8S] Configurable K8S context suppo...
Github user aditanase commented on a diff in the pull request: https://github.com/apache/spark/pull/22904#discussion_r239732258 --- Diff: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/SparkKubernetesClientFactory.scala --- @@ -67,8 +66,16 @@ private[spark] object SparkKubernetesClientFactory { val dispatcher = new Dispatcher( ThreadUtils.newDaemonCachedThreadPool("kubernetes-dispatcher")) -// TODO [SPARK-25887] Create builder in a way that respects configurable context -val config = new ConfigBuilder() +// Allow for specifying a context used to auto-configure from the users K8S config file +val kubeContext = sparkConf.get(KUBERNETES_CONTEXT).filter(c => StringUtils.isNotBlank(c)) +logInfo(s"Auto-configuring K8S client using " + + s"${if (kubeContext.isEmpty) s"context ${kubeContext.get}" else "current context"}" + + s" from users K8S config file") + +// Start from an auto-configured config with the desired context +// Fabric 8 uses null to indicate that the users current context should be used so if no +// explicit setting pass null +val config = new ConfigBuilder(autoConfigure(kubeContext.getOrElse(null))) --- End diff -- This somewhat answers my previous question, and it makes sense. Seems like Fabric8 is behaving consistently with other SDKs (e.g. AWS comes to mind as it will try 3-4 default configuration options, including env vars, iam machine role, etc). That being said, if you were to deploy an app inside a docker container, it won't see any of those and one still needs to inject some env vars or config maps for the AWS SDK to pick up. Long story short, I think this enhancement does not apply to client mode, and we should make that clearer in the docs, while providing alternatives for issues like https://issues.apache.org/jira/browse/SPARK-26295. Hopefully I'm making sense. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22904: [SPARK-25887][K8S] Configurable K8S context suppo...
Github user aditanase commented on a diff in the pull request: https://github.com/apache/spark/pull/22904#discussion_r239730534 --- Diff: docs/running-on-kubernetes.md --- @@ -298,6 +298,16 @@ the Spark application. ## Kubernetes Features +### Configuration File + +Your Kubernetes config file typically lives under `.kube/config` in your home directory or in a location specified by the `KUBECONFIG` environment variable. Spark on Kubernetes will attempt to use this file to do an initial auto-configuration of the Kubernetes client used to interact with the Kubernetes cluster. A variety of Spark configuration properties are provided that allow further customising the client configuration e.g. using an alternative authentication method. --- End diff -- Ok, I just opened https://issues.apache.org/jira/browse/SPARK-26295 and @vanzin redirected me to this thread. Would love your eyes on that issue, see if we can use your work here to close that too. In short, If there is code that propagates the kube context along this path, I'm not aware of it, would love to see some documentation: ``` laptop with kubectl and context > k apply -f spark-driver-client-mode.yaml -> deployment starts 1 instance of driver pod in arbitrary namespace -> spark submit from start.sh inside the docker container -> ... ``` There is no kubectl or "kube context" in the docker container, it's just the spark distro and my jars. So where would the driver pod get the account from? PS: agreed that there are too many config options on the auth side, maybe we could consolidate them more. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22904: [SPARK-25887][K8S] Configurable K8S context suppo...
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/22904#discussion_r239272037 --- Diff: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/SparkKubernetesClientFactory.scala --- @@ -67,8 +66,16 @@ private[spark] object SparkKubernetesClientFactory { val dispatcher = new Dispatcher( ThreadUtils.newDaemonCachedThreadPool("kubernetes-dispatcher")) -// TODO [SPARK-25887] Create builder in a way that respects configurable context -val config = new ConfigBuilder() +// Allow for specifying a context used to auto-configure from the users K8S config file +val kubeContext = sparkConf.get(KUBERNETES_CONTEXT).filter(_.nonEmpty) +logInfo(s"Auto-configuring K8S client using " + + s"${if (kubeContext.isDefined) s"context ${kubeContext.getOrElse("?")}" else "current context"}" + --- End diff -- I think using `kubeContext.map("context " + _).getOrElse("current context")` would make this cleaner. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22904: [SPARK-25887][K8S] Configurable K8S context suppo...
Github user rvesse commented on a diff in the pull request: https://github.com/apache/spark/pull/22904#discussion_r238782502 --- Diff: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/SparkKubernetesClientFactory.scala --- @@ -67,8 +66,16 @@ private[spark] object SparkKubernetesClientFactory { val dispatcher = new Dispatcher( ThreadUtils.newDaemonCachedThreadPool("kubernetes-dispatcher")) -// TODO [SPARK-25887] Create builder in a way that respects configurable context -val config = new ConfigBuilder() +// Allow for specifying a context used to auto-configure from the users K8S config file +val kubeContext = sparkConf.get(KUBERNETES_CONTEXT).filter(_.nonEmpty) +logInfo(s"Auto-configuring K8S client using " + + s"${if (kubeContext.isDefined) s"context ${kubeContext.getOrElse("?")}" else "current context"}" + --- End diff -- When I was testing this again today I was getting an error from just doing a plain `get` because for some reason the code was taking the first branch even though `kubeContext` was `None` so I changed to using an explicit `isDefined` instead of negating `isEmpty` and added `getOrElse` with a fallback just to avoid that --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22904: [SPARK-25887][K8S] Configurable K8S context suppo...
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/22904#discussion_r238780494 --- Diff: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/SparkKubernetesClientFactory.scala --- @@ -20,20 +20,22 @@ import java.io.File import com.google.common.base.Charsets import com.google.common.io.Files +import io.fabric8.kubernetes.client.Config.autoConfigure import io.fabric8.kubernetes.client.{ConfigBuilder, DefaultKubernetesClient, KubernetesClient} import io.fabric8.kubernetes.client.utils.HttpClientUtils import okhttp3.Dispatcher - +import org.apache.commons.lang3.StringUtils --- End diff -- Now unused. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22904: [SPARK-25887][K8S] Configurable K8S context suppo...
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/22904#discussion_r238780744 --- Diff: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/SparkKubernetesClientFactory.scala --- @@ -67,8 +66,16 @@ private[spark] object SparkKubernetesClientFactory { val dispatcher = new Dispatcher( ThreadUtils.newDaemonCachedThreadPool("kubernetes-dispatcher")) -// TODO [SPARK-25887] Create builder in a way that respects configurable context -val config = new ConfigBuilder() +// Allow for specifying a context used to auto-configure from the users K8S config file +val kubeContext = sparkConf.get(KUBERNETES_CONTEXT).filter(_.nonEmpty) +logInfo(s"Auto-configuring K8S client using " + + s"${if (kubeContext.isDefined) s"context ${kubeContext.getOrElse("?")}" else "current context"}" + --- End diff -- I saw your commit about the NPE, but I'm not sure how you could get one here? `sparkConf.get` will never return `Some(null)` as far as I know (it would return `None` instead). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22904: [SPARK-25887][K8S] Configurable K8S context suppo...
Github user rvesse commented on a diff in the pull request: https://github.com/apache/spark/pull/22904#discussion_r238779965 --- Diff: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/SparkKubernetesClientFactory.scala --- @@ -67,8 +66,16 @@ private[spark] object SparkKubernetesClientFactory { val dispatcher = new Dispatcher( ThreadUtils.newDaemonCachedThreadPool("kubernetes-dispatcher")) -// TODO [SPARK-25887] Create builder in a way that respects configurable context -val config = new ConfigBuilder() +// Allow for specifying a context used to auto-configure from the users K8S config file +val kubeContext = sparkConf.get(KUBERNETES_CONTEXT).filter(c => StringUtils.isNotBlank(c)) +logInfo(s"Auto-configuring K8S client using " + + s"${if (kubeContext.isEmpty) s"context ${kubeContext.get}" else "current context"}" + + s" from users K8S config file") + +// Start from an auto-configured config with the desired context +// Fabric 8 uses null to indicate that the users current context should be used so if no +// explicit setting pass null +val config = new ConfigBuilder(autoConfigure(kubeContext.getOrElse(null))) --- End diff -- Yes and I was referring to the K8S config file :) And yes the fact that we would propagate `spark.kubernetes.context` into the pod shouldn't be an issue because there won't be any K8S config file for it to interact with inside the pod as in-pod K8S config should be from the service account token that gets injected into the pod --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22904: [SPARK-25887][K8S] Configurable K8S context suppo...
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/22904#discussion_r238502012 --- Diff: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/SparkKubernetesClientFactory.scala --- @@ -67,8 +66,16 @@ private[spark] object SparkKubernetesClientFactory { val dispatcher = new Dispatcher( ThreadUtils.newDaemonCachedThreadPool("kubernetes-dispatcher")) -// TODO [SPARK-25887] Create builder in a way that respects configurable context -val config = new ConfigBuilder() +// Allow for specifying a context used to auto-configure from the users K8S config file +val kubeContext = sparkConf.get(KUBERNETES_CONTEXT).filter(c => StringUtils.isNotBlank(c)) +logInfo(s"Auto-configuring K8S client using " + + s"${if (kubeContext.isEmpty) s"context ${kubeContext.get}" else "current context"}" + + s" from users K8S config file") + +// Start from an auto-configured config with the desired context +// Fabric 8 uses null to indicate that the users current context should be used so if no +// explicit setting pass null +val config = new ConfigBuilder(autoConfigure(kubeContext.getOrElse(null))) --- End diff -- (Just to clarify I'm referring to the Spark config. Too many config files involved here...) --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22904: [SPARK-25887][K8S] Configurable K8S context suppo...
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/22904#discussion_r238484698 --- Diff: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/SparkKubernetesClientFactory.scala --- @@ -67,8 +66,16 @@ private[spark] object SparkKubernetesClientFactory { val dispatcher = new Dispatcher( ThreadUtils.newDaemonCachedThreadPool("kubernetes-dispatcher")) -// TODO [SPARK-25887] Create builder in a way that respects configurable context -val config = new ConfigBuilder() +// Allow for specifying a context used to auto-configure from the users K8S config file +val kubeContext = sparkConf.get(KUBERNETES_CONTEXT).filter(c => StringUtils.isNotBlank(c)) +logInfo(s"Auto-configuring K8S client using " + + s"${if (kubeContext.isEmpty) s"context ${kubeContext.get}" else "current context"}" + + s" from users K8S config file") + +// Start from an auto-configured config with the desired context +// Fabric 8 uses null to indicate that the users current context should be used so if no +// explicit setting pass null +val config = new ConfigBuilder(autoConfigure(kubeContext.getOrElse(null))) --- End diff -- > we don't propagate the submission clients config file We don't propagate the config *file* itself, but we do propagate all its contents, as far as I remember. But given your explanation it should work fine, unless there's a different config context with the same name inside the container... --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22904: [SPARK-25887][K8S] Configurable K8S context suppo...
Github user rvesse commented on a diff in the pull request: https://github.com/apache/spark/pull/22904#discussion_r238484145 --- Diff: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/SparkKubernetesClientFactory.scala --- @@ -67,8 +66,16 @@ private[spark] object SparkKubernetesClientFactory { val dispatcher = new Dispatcher( ThreadUtils.newDaemonCachedThreadPool("kubernetes-dispatcher")) -// TODO [SPARK-25887] Create builder in a way that respects configurable context -val config = new ConfigBuilder() +// Allow for specifying a context used to auto-configure from the users K8S config file +val kubeContext = sparkConf.get(KUBERNETES_CONTEXT).filter(c => StringUtils.isNotBlank(c)) --- End diff -- Fixed in latest commit --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22904: [SPARK-25887][K8S] Configurable K8S context suppo...
Github user rvesse commented on a diff in the pull request: https://github.com/apache/spark/pull/22904#discussion_r238483901 --- Diff: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/SparkKubernetesClientFactory.scala --- @@ -67,8 +66,16 @@ private[spark] object SparkKubernetesClientFactory { val dispatcher = new Dispatcher( ThreadUtils.newDaemonCachedThreadPool("kubernetes-dispatcher")) -// TODO [SPARK-25887] Create builder in a way that respects configurable context -val config = new ConfigBuilder() +// Allow for specifying a context used to auto-configure from the users K8S config file +val kubeContext = sparkConf.get(KUBERNETES_CONTEXT).filter(c => StringUtils.isNotBlank(c)) +logInfo(s"Auto-configuring K8S client using " + + s"${if (kubeContext.isEmpty) s"context ${kubeContext.get}" else "current context"}" + + s" from users K8S config file") + +// Start from an auto-configured config with the desired context +// Fabric 8 uses null to indicate that the users current context should be used so if no +// explicit setting pass null +val config = new ConfigBuilder(autoConfigure(kubeContext.getOrElse(null))) --- End diff -- If the context does not exist then Fabric 8 falls back to other ways of auto-configuring itself (e.g. service account) Fabric 8 skips any file based auto-configuration if there is no K8S config file present (https://github.com/fabric8io/kubernetes-client/blob/master/kubernetes-client/src/main/java/io/fabric8/kubernetes/client/Config.java#L436-L459). Since we don't propagate the submission clients config file into the driver pods no auto-configuration from config file will be attempted in the driver because there won't be a config file present. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22904: [SPARK-25887][K8S] Configurable K8S context suppo...
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/22904#discussion_r238440694 --- Diff: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/SparkKubernetesClientFactory.scala --- @@ -67,8 +66,16 @@ private[spark] object SparkKubernetesClientFactory { val dispatcher = new Dispatcher( ThreadUtils.newDaemonCachedThreadPool("kubernetes-dispatcher")) -// TODO [SPARK-25887] Create builder in a way that respects configurable context -val config = new ConfigBuilder() +// Allow for specifying a context used to auto-configure from the users K8S config file +val kubeContext = sparkConf.get(KUBERNETES_CONTEXT).filter(c => StringUtils.isNotBlank(c)) +logInfo(s"Auto-configuring K8S client using " + + s"${if (kubeContext.isEmpty) s"context ${kubeContext.get}" else "current context"}" + + s" from users K8S config file") + +// Start from an auto-configured config with the desired context +// Fabric 8 uses null to indicate that the users current context should be used so if no +// explicit setting pass null +val config = new ConfigBuilder(autoConfigure(kubeContext.getOrElse(null))) --- End diff -- What happens here when the context does not exist? Does it fall back to the default? e.g. in cluster mode, the config you're adding will be propagated to the driver, and then this code will be called with the same context as the submission node. What if that context does not exist inside the driver container? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22904: [SPARK-25887][K8S] Configurable K8S context suppo...
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/22904#discussion_r238439850 --- Diff: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/SparkKubernetesClientFactory.scala --- @@ -67,8 +66,16 @@ private[spark] object SparkKubernetesClientFactory { val dispatcher = new Dispatcher( ThreadUtils.newDaemonCachedThreadPool("kubernetes-dispatcher")) -// TODO [SPARK-25887] Create builder in a way that respects configurable context -val config = new ConfigBuilder() +// Allow for specifying a context used to auto-configure from the users K8S config file +val kubeContext = sparkConf.get(KUBERNETES_CONTEXT).filter(c => StringUtils.isNotBlank(c)) --- End diff -- Either `.filter { c => ... }` or `.filter(StringUtils.isNotBlank)`. But really you can skip the extra dependency (`.filter(_.nonEmpty)`). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22904: [SPARK-25887][K8S] Configurable K8S context suppo...
Github user rvesse commented on a diff in the pull request: https://github.com/apache/spark/pull/22904#discussion_r237447038 --- Diff: docs/running-on-kubernetes.md --- @@ -298,6 +298,16 @@ the Spark application. ## Kubernetes Features +### Configuration File + +Your Kubernetes config file typically lives under `.kube/config` in your home directory or in a location specified by the `KUBECONFIG` environment variable. Spark on Kubernetes will attempt to use this file to do an initial auto-configuration of the Kubernetes client used to interact with the Kubernetes cluster. A variety of Spark configuration properties are provided that allow further customising the client configuration e.g. using an alternative authentication method. --- End diff -- To be frank I'm not really sure why there are different config options for client vs cluster mode and this may have changed with some of the cleanup that @vanzin has been doing lately to simplify the configuration code. Personally I have never needed to use any of the additional configuration properties in either client/cluster mode as the auto-configuration from my K8S config file has always been sufficient. At worst I've needed to set `KUBECONFIG` to select the correct config file for the cluster I want to submit to. Note that the core behaviour (the auto-configuration) has always existed implicitly in the K8S backend but was just not called out explicitly previously in the docs. This PR primarily just makes it more explicit and flexible for users who have multiple contexts in their config files. WRT `spark.master` Spark in general requires that to always be set and will use that to override whatever is present in the K8S config file regardless. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22904: [SPARK-25887][K8S] Configurable K8S context suppo...
Github user aditanase commented on a diff in the pull request: https://github.com/apache/spark/pull/22904#discussion_r236663827 --- Diff: docs/running-on-kubernetes.md --- @@ -298,6 +298,16 @@ the Spark application. ## Kubernetes Features +### Configuration File + +Your Kubernetes config file typically lives under `.kube/config` in your home directory or in a location specified by the `KUBECONFIG` environment variable. Spark on Kubernetes will attempt to use this file to do an initial auto-configuration of the Kubernetes client used to interact with the Kubernetes cluster. A variety of Spark configuration properties are provided that allow further customising the client configuration e.g. using an alternative authentication method. --- End diff -- I would add more details about what configuration we can reuse from the context. For example, would `spark.master` be taken from kubernetes master? I'm trying to understand what's the net new benefit for somebody using this in their day-to-day work. Also, now that we have client mode, I anticipate many users switching to that to emulate a typical Mesos deployment (e.g. driver in client mode, kept alive by Marathon). I am assuming this does not apply in client mode, where all the options need to be explicitly specified. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22904: [SPARK-25887][K8S] Configurable K8S context suppo...
Github user rvesse commented on a diff in the pull request: https://github.com/apache/spark/pull/22904#discussion_r230318413 --- Diff: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala --- @@ -23,6 +23,18 @@ import org.apache.spark.internal.config.ConfigBuilder private[spark] object Config extends Logging { + val KUBERNETES_CONTEXT = +ConfigBuilder("spark.kubernetes.context") + .doc("The desired context from your K8S config file used to configure the K8S " + +"client for interacting with the cluster. Useful if your config file has " + +"multiple clusters or user identities defined. The client library used " + +"locates the config file via the KUBECONFIG environment variable or by defaulting " + +"to .kube/config under your home directory. If not specified then your current " + +"context is used. You can always override specific aspects of the config file " + +"provided configuration using other Spark on K8S configuration options.") + .stringConf + .createWithDefault(null) --- End diff -- Refactored to use `Option` --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22904: [SPARK-25887][K8S] Configurable K8S context suppo...
Github user mccheah commented on a diff in the pull request: https://github.com/apache/spark/pull/22904#discussion_r230127363 --- Diff: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala --- @@ -23,6 +23,18 @@ import org.apache.spark.internal.config.ConfigBuilder private[spark] object Config extends Logging { + val KUBERNETES_CONTEXT = +ConfigBuilder("spark.kubernetes.context") + .doc("The desired context from your K8S config file used to configure the K8S " + +"client for interacting with the cluster. Useful if your config file has " + +"multiple clusters or user identities defined. The client library used " + +"locates the config file via the KUBECONFIG environment variable or by defaulting " + +"to .kube/config under your home directory. If not specified then your current " + +"context is used. You can always override specific aspects of the config file " + +"provided configuration using other Spark on K8S configuration options.") + .stringConf + .createWithDefault(null) --- End diff -- Default of `null` seems strange, surely we can make this `Optional` and just handle `empty` in the places that use this? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22904: [SPARK-25887][K8S] Configurable K8S context suppo...
GitHub user rvesse opened a pull request: https://github.com/apache/spark/pull/22904 [SPARK-25887][K8S] Configurable K8S context support ## What changes were proposed in this pull request? This enhancement allows for specifying the desired context to use for the initial K8S client auto-configuration. This allows users to more easily access alternative K8S contexts without having to first explicitly change their current context via kubectl. ## How was this patch tested? Explicitly set my K8S context to a context pointing to a non-existent cluster, then launched Spark jobs with explicitly specified contexts via the new `spark.kubernetes.context` configuration property. Example Output: ``` > kubectl config current-context minikube > minikube status minikube: Stopped cluster: kubectl: > ./spark-submit --master k8s://https://localhost:6443 --deploy-mode cluster --name spark-pi --class org.apache.spark.examples.SparkPi --conf spark.executor.instances=2 --conf spark.kubernetes.context=docker-for-desktop --conf spark.kubernetes.container.image=rvesse/spark:debian local:///opt/spark/examples/jars/spark-examples_2.11-3.0.0-SNAPSHOT.jar 4 18/10/31 11:57:51 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 18/10/31 11:57:51 INFO SparkKubernetesClientFactory: Auto-configuring K8S client using context docker-for-desktop from users K8S config file 18/10/31 11:57:52 INFO LoggingPodStatusWatcherImpl: State changed, new state: pod name: spark-pi-1540987071845-driver namespace: default labels: spark-app-selector -> spark-2c4abc226ed3415986eb602bd13f3582, spark-role -> driver pod uid: 32462cac-dd04-11e8-b6c6-0251 creation time: 2018-10-31T11:57:52Z service account name: default volumes: spark-local-dir-1, spark-conf-volume, default-token-glpfv node name: N/A start time: N/A phase: Pending container status: N/A 18/10/31 11:57:52 INFO LoggingPodStatusWatcherImpl: State changed, new state: pod name: spark-pi-1540987071845-driver namespace: default labels: spark-app-selector -> spark-2c4abc226ed3415986eb602bd13f3582, spark-role -> driver pod uid: 32462cac-dd04-11e8-b6c6-0251 creation time: 2018-10-31T11:57:52Z service account name: default volumes: spark-local-dir-1, spark-conf-volume, default-token-glpfv node name: docker-for-desktop start time: N/A phase: Pending container status: N/A ... 18/10/31 11:58:03 INFO LoggingPodStatusWatcherImpl: State changed, new state: pod name: spark-pi-1540987071845-driver namespace: default labels: spark-app-selector -> spark-2c4abc226ed3415986eb602bd13f3582, spark-role -> driver pod uid: 32462cac-dd04-11e8-b6c6-0251 creation time: 2018-10-31T11:57:52Z service account name: default volumes: spark-local-dir-1, spark-conf-volume, default-token-glpfv node name: docker-for-desktop start time: 2018-10-31T11:57:52Z phase: Succeeded container status: container name: spark-kubernetes-driver container image: rvesse/spark:debian container state: terminated container started at: 2018-10-31T11:57:54Z container finished at: 2018-10-31T11:58:02Z exit code: 0 termination reason: Completed ``` Without the `spark.kubernetes.context` setting this will fail because the current context - `minikube` - is pointing to a non-running cluster e.g. ``` > ./spark-submit --master k8s://https://localhost:6443 --deploy-mode cluster --name spark-pi --class org.apache.spark.examples.SparkPi --conf spark.executor.instances=2 --conf spark.kubernetes.container.image=rvesse/spark:debian local:///opt/spark/examples/jars/spark-examples_2.11-3.0.0-SNAPSHOT.jar 4 18/10/31 12:02:30 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 18/10/31 12:02:30 INFO SparkKubernetesClientFactory: Auto-configuring K8S client using current context from users K8S config file 18/10/31 12:02:31 WARN WatchConnectionManager: Exec Failure javax.net.ssl.SSLHandshakeException: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target at sun.security.ssl.Alerts.getSSLException(Alerts.java:192) at sun.security.ssl.SSLSocketImpl.fatal(SSLSocketImpl.java:1949) at sun.security.ssl.Handshaker.fatalSE(Handshaker.java:302) at