[GitHub] spark pull request #22904: [SPARK-25887][K8S] Configurable K8S context suppo...

2018-12-10 Thread aditanase
Github user aditanase commented on a diff in the pull request:

https://github.com/apache/spark/pull/22904#discussion_r240141774
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/SparkKubernetesClientFactory.scala
 ---
@@ -67,8 +66,16 @@ private[spark] object SparkKubernetesClientFactory {
 val dispatcher = new Dispatcher(
   ThreadUtils.newDaemonCachedThreadPool("kubernetes-dispatcher"))
 
-// TODO [SPARK-25887] Create builder in a way that respects configurable context
-val config = new ConfigBuilder()
+// Allow for specifying a context used to auto-configure from the users K8S config file
+val kubeContext = sparkConf.get(KUBERNETES_CONTEXT).filter(c => StringUtils.isNotBlank(c))
+logInfo(s"Auto-configuring K8S client using " +
+  s"${if (kubeContext.isEmpty) s"context ${kubeContext.get}" else "current context"}" +
+  s" from users K8S config file")
+
+// Start from an auto-configured config with the desired context
+// Fabric 8 uses null to indicate that the users current context should be used so if no
+// explicit setting pass null
+val config = new ConfigBuilder(autoConfigure(kubeContext.getOrElse(null)))
--- End diff --

Ok, this is starting to sink in. I just realized that in my deployment 
scenario, all I need to do is to specify the `serviceAccountName` on _my_ pod 
template that contains the driver.

That leaves me confused though - what other "client mode deployments" do 
you see happening on K8S? What does client mode mean to you?
Also - how should one interpret this paragraph in the docs?
`In client mode, use spark.kubernetes.authenticate.serviceAccountName instead.`


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22904: [SPARK-25887][K8S] Configurable K8S context suppo...

2018-12-07 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/22904#discussion_r239903144
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/SparkKubernetesClientFactory.scala
 ---
@@ -67,8 +66,16 @@ private[spark] object SparkKubernetesClientFactory {
 val dispatcher = new Dispatcher(
   ThreadUtils.newDaemonCachedThreadPool("kubernetes-dispatcher"))
 
-// TODO [SPARK-25887] Create builder in a way that respects configurable context
-val config = new ConfigBuilder()
+// Allow for specifying a context used to auto-configure from the users K8S config file
+val kubeContext = sparkConf.get(KUBERNETES_CONTEXT).filter(c => StringUtils.isNotBlank(c))
+logInfo(s"Auto-configuring K8S client using " +
+  s"${if (kubeContext.isEmpty) s"context ${kubeContext.get}" else "current context"}" +
+  s" from users K8S config file")
+
+// Start from an auto-configured config with the desired context
+// Fabric 8 uses null to indicate that the users current context should be used so if no
+// explicit setting pass null
+val config = new ConfigBuilder(autoConfigure(kubeContext.getOrElse(null)))
--- End diff --

> I think this enhancement does not apply to client mode

If you mean "client mode inside a k8s-managed docker container", then yes, 
you may need to do extra stuff, like mount the appropriate credentials. But in 
the "client mode with driver inside k8s pod" case, Spark does not create that 
pod for you. So I'm not sure how Spark can help with anything there; the 
`serviceName` configuration seems targeted at propagating the credentials of 
the submitter to the driver pod, and in that case Spark is not creating the 
driver pod at all.


---




[GitHub] spark pull request #22904: [SPARK-25887][K8S] Configurable K8S context suppo...

2018-12-07 Thread aditanase
Github user aditanase commented on a diff in the pull request:

https://github.com/apache/spark/pull/22904#discussion_r239732258
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/SparkKubernetesClientFactory.scala
 ---
@@ -67,8 +66,16 @@ private[spark] object SparkKubernetesClientFactory {
 val dispatcher = new Dispatcher(
   ThreadUtils.newDaemonCachedThreadPool("kubernetes-dispatcher"))
 
-// TODO [SPARK-25887] Create builder in a way that respects configurable context
-val config = new ConfigBuilder()
+// Allow for specifying a context used to auto-configure from the users K8S config file
+val kubeContext = sparkConf.get(KUBERNETES_CONTEXT).filter(c => StringUtils.isNotBlank(c))
+logInfo(s"Auto-configuring K8S client using " +
+  s"${if (kubeContext.isEmpty) s"context ${kubeContext.get}" else "current context"}" +
+  s" from users K8S config file")
+
+// Start from an auto-configured config with the desired context
+// Fabric 8 uses null to indicate that the users current context should be used so if no
+// explicit setting pass null
+val config = new ConfigBuilder(autoConfigure(kubeContext.getOrElse(null)))
--- End diff --

This somewhat answers my previous question, and it makes sense. It seems Fabric8 is behaving consistently with other SDKs (e.g. AWS comes to mind, as its SDK will try 3-4 default configuration options, including env vars, IAM machine role, etc.).

That being said, if you were to deploy an app inside a Docker container, it wouldn't see any of those, and one would still need to inject some env vars or config maps for the AWS SDK to pick up.

Long story short, I think this enhancement does not apply to client mode, 
and we should make that clearer in the docs, while providing alternatives for 
issues like https://issues.apache.org/jira/browse/SPARK-26295.

Hopefully I'm making sense.


---




[GitHub] spark pull request #22904: [SPARK-25887][K8S] Configurable K8S context suppo...

2018-12-07 Thread aditanase
Github user aditanase commented on a diff in the pull request:

https://github.com/apache/spark/pull/22904#discussion_r239730534
  
--- Diff: docs/running-on-kubernetes.md ---
@@ -298,6 +298,16 @@ the Spark application.
 
 ## Kubernetes Features
 
+### Configuration File
+
+Your Kubernetes config file typically lives under `.kube/config` in your 
home directory or in a location specified by the `KUBECONFIG` environment 
variable.  Spark on Kubernetes will attempt to use this file to do an initial 
auto-configuration of the Kubernetes client used to interact with the 
Kubernetes cluster.  A variety of Spark configuration properties are provided 
that allow further customising the client configuration e.g. using an 
alternative authentication method.
--- End diff --

Ok, I just opened https://issues.apache.org/jira/browse/SPARK-26295 and 
@vanzin redirected me to this thread. Would love your eyes on that issue, see 
if we can use your work here to close that too.

In short, if there is code that propagates the kube context along this path, I'm not aware of it; I'd love to see some documentation:
```
laptop with kubectl and context -> k apply -f spark-driver-client-mode.yaml ->
deployment starts 1 instance of driver pod in arbitrary namespace ->
spark submit from start.sh inside the docker container -> ...
```
There is no kubectl or "kube context" in the docker container, it's just 
the spark distro and my jars. So where would the driver pod get the account 
from?

PS: agreed that there are too many config options on the auth side, maybe 
we could consolidate them more.


---




[GitHub] spark pull request #22904: [SPARK-25887][K8S] Configurable K8S context suppo...

2018-12-05 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/22904#discussion_r239272037
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/SparkKubernetesClientFactory.scala
 ---
@@ -67,8 +66,16 @@ private[spark] object SparkKubernetesClientFactory {
 val dispatcher = new Dispatcher(
   ThreadUtils.newDaemonCachedThreadPool("kubernetes-dispatcher"))
 
-// TODO [SPARK-25887] Create builder in a way that respects configurable context
-val config = new ConfigBuilder()
+// Allow for specifying a context used to auto-configure from the users K8S config file
+val kubeContext = sparkConf.get(KUBERNETES_CONTEXT).filter(_.nonEmpty)
+logInfo(s"Auto-configuring K8S client using " +
+  s"${if (kubeContext.isDefined) s"context ${kubeContext.getOrElse("?")}" else "current context"}" +
--- End diff --

I think using `kubeContext.map("context " + _).getOrElse("current 
context")` would make this cleaner.


---




[GitHub] spark pull request #22904: [SPARK-25887][K8S] Configurable K8S context suppo...

2018-12-04 Thread rvesse
Github user rvesse commented on a diff in the pull request:

https://github.com/apache/spark/pull/22904#discussion_r238782502
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/SparkKubernetesClientFactory.scala
 ---
@@ -67,8 +66,16 @@ private[spark] object SparkKubernetesClientFactory {
 val dispatcher = new Dispatcher(
   ThreadUtils.newDaemonCachedThreadPool("kubernetes-dispatcher"))
 
-// TODO [SPARK-25887] Create builder in a way that respects configurable context
-val config = new ConfigBuilder()
+// Allow for specifying a context used to auto-configure from the users K8S config file
+val kubeContext = sparkConf.get(KUBERNETES_CONTEXT).filter(_.nonEmpty)
+logInfo(s"Auto-configuring K8S client using " +
+  s"${if (kubeContext.isDefined) s"context ${kubeContext.getOrElse("?")}" else "current context"}" +
--- End diff --

When I was testing this again today I was getting an error from just doing a plain `get`, because for some reason the code was taking the first branch even though `kubeContext` was `None`. So I changed to an explicit `isDefined` instead of negating `isEmpty`, and added `getOrElse` with a fallback, just to avoid that.


---




[GitHub] spark pull request #22904: [SPARK-25887][K8S] Configurable K8S context suppo...

2018-12-04 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/22904#discussion_r238780494
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/SparkKubernetesClientFactory.scala
 ---
@@ -20,20 +20,22 @@ import java.io.File
 
 import com.google.common.base.Charsets
 import com.google.common.io.Files
+import io.fabric8.kubernetes.client.Config.autoConfigure
 import io.fabric8.kubernetes.client.{ConfigBuilder, DefaultKubernetesClient, KubernetesClient}
 import io.fabric8.kubernetes.client.utils.HttpClientUtils
 import okhttp3.Dispatcher
-
+import org.apache.commons.lang3.StringUtils
--- End diff --

Now unused.


---




[GitHub] spark pull request #22904: [SPARK-25887][K8S] Configurable K8S context suppo...

2018-12-04 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/22904#discussion_r238780744
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/SparkKubernetesClientFactory.scala
 ---
@@ -67,8 +66,16 @@ private[spark] object SparkKubernetesClientFactory {
 val dispatcher = new Dispatcher(
   ThreadUtils.newDaemonCachedThreadPool("kubernetes-dispatcher"))
 
-// TODO [SPARK-25887] Create builder in a way that respects configurable context
-val config = new ConfigBuilder()
+// Allow for specifying a context used to auto-configure from the users K8S config file
+val kubeContext = sparkConf.get(KUBERNETES_CONTEXT).filter(_.nonEmpty)
+logInfo(s"Auto-configuring K8S client using " +
+  s"${if (kubeContext.isDefined) s"context ${kubeContext.getOrElse("?")}" else "current context"}" +
--- End diff --

I saw your commit about the NPE, but I'm not sure how you could get one 
here?

`sparkConf.get` will never return `Some(null)` as far as I know (it would 
return `None` instead).


---




[GitHub] spark pull request #22904: [SPARK-25887][K8S] Configurable K8S context suppo...

2018-12-04 Thread rvesse
Github user rvesse commented on a diff in the pull request:

https://github.com/apache/spark/pull/22904#discussion_r238779965
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/SparkKubernetesClientFactory.scala
 ---
@@ -67,8 +66,16 @@ private[spark] object SparkKubernetesClientFactory {
 val dispatcher = new Dispatcher(
   ThreadUtils.newDaemonCachedThreadPool("kubernetes-dispatcher"))
 
-// TODO [SPARK-25887] Create builder in a way that respects configurable context
-val config = new ConfigBuilder()
+// Allow for specifying a context used to auto-configure from the users K8S config file
+val kubeContext = sparkConf.get(KUBERNETES_CONTEXT).filter(c => StringUtils.isNotBlank(c))
+logInfo(s"Auto-configuring K8S client using " +
+  s"${if (kubeContext.isEmpty) s"context ${kubeContext.get}" else "current context"}" +
+  s" from users K8S config file")
+
+// Start from an auto-configured config with the desired context
+// Fabric 8 uses null to indicate that the users current context should be used so if no
+// explicit setting pass null
+val config = new ConfigBuilder(autoConfigure(kubeContext.getOrElse(null)))
--- End diff --

Yes, and I was referring to the K8S config file :) And yes, the fact that we would propagate `spark.kubernetes.context` into the pod shouldn't be an issue, because there won't be any K8S config file for it to interact with inside the pod; in-pod K8S config should come from the service account token that gets injected into the pod.


---




[GitHub] spark pull request #22904: [SPARK-25887][K8S] Configurable K8S context suppo...

2018-12-03 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/22904#discussion_r238502012
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/SparkKubernetesClientFactory.scala
 ---
@@ -67,8 +66,16 @@ private[spark] object SparkKubernetesClientFactory {
 val dispatcher = new Dispatcher(
   ThreadUtils.newDaemonCachedThreadPool("kubernetes-dispatcher"))
 
-// TODO [SPARK-25887] Create builder in a way that respects configurable context
-val config = new ConfigBuilder()
+// Allow for specifying a context used to auto-configure from the users K8S config file
+val kubeContext = sparkConf.get(KUBERNETES_CONTEXT).filter(c => StringUtils.isNotBlank(c))
+logInfo(s"Auto-configuring K8S client using " +
+  s"${if (kubeContext.isEmpty) s"context ${kubeContext.get}" else "current context"}" +
+  s" from users K8S config file")
+
+// Start from an auto-configured config with the desired context
+// Fabric 8 uses null to indicate that the users current context should be used so if no
+// explicit setting pass null
+val config = new ConfigBuilder(autoConfigure(kubeContext.getOrElse(null)))
--- End diff --

(Just to clarify I'm referring to the Spark config. Too many config files 
involved here...)


---




[GitHub] spark pull request #22904: [SPARK-25887][K8S] Configurable K8S context suppo...

2018-12-03 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/22904#discussion_r238484698
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/SparkKubernetesClientFactory.scala
 ---
@@ -67,8 +66,16 @@ private[spark] object SparkKubernetesClientFactory {
 val dispatcher = new Dispatcher(
   ThreadUtils.newDaemonCachedThreadPool("kubernetes-dispatcher"))
 
-// TODO [SPARK-25887] Create builder in a way that respects configurable context
-val config = new ConfigBuilder()
+// Allow for specifying a context used to auto-configure from the users K8S config file
+val kubeContext = sparkConf.get(KUBERNETES_CONTEXT).filter(c => StringUtils.isNotBlank(c))
+logInfo(s"Auto-configuring K8S client using " +
+  s"${if (kubeContext.isEmpty) s"context ${kubeContext.get}" else "current context"}" +
+  s" from users K8S config file")
+
+// Start from an auto-configured config with the desired context
+// Fabric 8 uses null to indicate that the users current context should be used so if no
+// explicit setting pass null
+val config = new ConfigBuilder(autoConfigure(kubeContext.getOrElse(null)))
--- End diff --

> we don't propagate the submission clients config file

We don't propagate the config *file* itself, but we do propagate all its 
contents, as far as I remember. But given your explanation it should work fine, 
unless there's a different config context with the same name inside the 
container...


---




[GitHub] spark pull request #22904: [SPARK-25887][K8S] Configurable K8S context suppo...

2018-12-03 Thread rvesse
Github user rvesse commented on a diff in the pull request:

https://github.com/apache/spark/pull/22904#discussion_r238484145
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/SparkKubernetesClientFactory.scala
 ---
@@ -67,8 +66,16 @@ private[spark] object SparkKubernetesClientFactory {
 val dispatcher = new Dispatcher(
   ThreadUtils.newDaemonCachedThreadPool("kubernetes-dispatcher"))
 
-// TODO [SPARK-25887] Create builder in a way that respects configurable context
-val config = new ConfigBuilder()
+// Allow for specifying a context used to auto-configure from the users K8S config file
+val kubeContext = sparkConf.get(KUBERNETES_CONTEXT).filter(c => StringUtils.isNotBlank(c))
--- End diff --

Fixed in latest commit


---




[GitHub] spark pull request #22904: [SPARK-25887][K8S] Configurable K8S context suppo...

2018-12-03 Thread rvesse
Github user rvesse commented on a diff in the pull request:

https://github.com/apache/spark/pull/22904#discussion_r238483901
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/SparkKubernetesClientFactory.scala
 ---
@@ -67,8 +66,16 @@ private[spark] object SparkKubernetesClientFactory {
 val dispatcher = new Dispatcher(
   ThreadUtils.newDaemonCachedThreadPool("kubernetes-dispatcher"))
 
-// TODO [SPARK-25887] Create builder in a way that respects configurable context
-val config = new ConfigBuilder()
+// Allow for specifying a context used to auto-configure from the users K8S config file
+val kubeContext = sparkConf.get(KUBERNETES_CONTEXT).filter(c => StringUtils.isNotBlank(c))
+logInfo(s"Auto-configuring K8S client using " +
+  s"${if (kubeContext.isEmpty) s"context ${kubeContext.get}" else "current context"}" +
+  s" from users K8S config file")
+
+// Start from an auto-configured config with the desired context
+// Fabric 8 uses null to indicate that the users current context should be used so if no
+// explicit setting pass null
+val config = new ConfigBuilder(autoConfigure(kubeContext.getOrElse(null)))
--- End diff --

If the context does not exist then Fabric 8 falls back to other ways of 
auto-configuring itself (e.g. service account)

Fabric 8 skips any file based auto-configuration if there is no K8S config 
file present 
(https://github.com/fabric8io/kubernetes-client/blob/master/kubernetes-client/src/main/java/io/fabric8/kubernetes/client/Config.java#L436-L459).

Since we don't propagate the submission client's config file into the driver pods, no auto-configuration from a config file will be attempted in the driver, because there won't be a config file present.


---




[GitHub] spark pull request #22904: [SPARK-25887][K8S] Configurable K8S context suppo...

2018-12-03 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/22904#discussion_r238440694
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/SparkKubernetesClientFactory.scala
 ---
@@ -67,8 +66,16 @@ private[spark] object SparkKubernetesClientFactory {
 val dispatcher = new Dispatcher(
   ThreadUtils.newDaemonCachedThreadPool("kubernetes-dispatcher"))
 
-// TODO [SPARK-25887] Create builder in a way that respects configurable context
-val config = new ConfigBuilder()
+// Allow for specifying a context used to auto-configure from the users K8S config file
+val kubeContext = sparkConf.get(KUBERNETES_CONTEXT).filter(c => StringUtils.isNotBlank(c))
+logInfo(s"Auto-configuring K8S client using " +
+  s"${if (kubeContext.isEmpty) s"context ${kubeContext.get}" else "current context"}" +
+  s" from users K8S config file")
+
+// Start from an auto-configured config with the desired context
+// Fabric 8 uses null to indicate that the users current context should be used so if no
+// explicit setting pass null
+val config = new ConfigBuilder(autoConfigure(kubeContext.getOrElse(null)))
--- End diff --

What happens here when the context does not exist? Does it fall back to the 
default?

e.g. in cluster mode, the config you're adding will be propagated to the 
driver, and then this code will be called with the same context as the 
submission node. What if that context does not exist inside the driver 
container?


---




[GitHub] spark pull request #22904: [SPARK-25887][K8S] Configurable K8S context suppo...

2018-12-03 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/22904#discussion_r238439850
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/SparkKubernetesClientFactory.scala
 ---
@@ -67,8 +66,16 @@ private[spark] object SparkKubernetesClientFactory {
 val dispatcher = new Dispatcher(
   ThreadUtils.newDaemonCachedThreadPool("kubernetes-dispatcher"))
 
-// TODO [SPARK-25887] Create builder in a way that respects configurable context
-val config = new ConfigBuilder()
+// Allow for specifying a context used to auto-configure from the users K8S config file
+val kubeContext = sparkConf.get(KUBERNETES_CONTEXT).filter(c => StringUtils.isNotBlank(c))
--- End diff --

Either `.filter { c => ... }` or `.filter(StringUtils.isNotBlank)`. But 
really you can skip the extra dependency (`.filter(_.nonEmpty)`).
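A quick sketch of the dependency-free variant's behaviour. One subtle difference worth noting: `_.nonEmpty` keeps a whitespace-only value, whereas `StringUtils.isNotBlank` would drop it.

```scala
// Blank-string filtering with plain Scala, no commons-lang3 needed.
val set: Option[String] = Some("minikube").filter(_.nonEmpty)   // Some("minikube")
val empty: Option[String] = Some("").filter(_.nonEmpty)         // None
// Caveat: whitespace-only survives, unlike StringUtils.isNotBlank
val blank: Option[String] = Some("   ").filter(_.nonEmpty)      // Some("   ")
```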


---




[GitHub] spark pull request #22904: [SPARK-25887][K8S] Configurable K8S context suppo...

2018-11-29 Thread rvesse
Github user rvesse commented on a diff in the pull request:

https://github.com/apache/spark/pull/22904#discussion_r237447038
  
--- Diff: docs/running-on-kubernetes.md ---
@@ -298,6 +298,16 @@ the Spark application.
 
 ## Kubernetes Features
 
+### Configuration File
+
+Your Kubernetes config file typically lives under `.kube/config` in your 
home directory or in a location specified by the `KUBECONFIG` environment 
variable.  Spark on Kubernetes will attempt to use this file to do an initial 
auto-configuration of the Kubernetes client used to interact with the 
Kubernetes cluster.  A variety of Spark configuration properties are provided 
that allow further customising the client configuration e.g. using an 
alternative authentication method.
--- End diff --

To be frank, I'm not really sure why there are different config options for client vs cluster mode; this may have changed with some of the cleanup that @vanzin has been doing lately to simplify the configuration code.

Personally I have never needed to use any of the additional configuration 
properties in either client/cluster mode as the auto-configuration from my K8S 
config file has always been sufficient.  At worst I've needed to set 
`KUBECONFIG` to select the correct config file for the cluster I want to submit 
to.

Note that the core behaviour (the auto-configuration) has always existed 
implicitly in the K8S backend but was just not called out explicitly previously 
in the docs.  This PR primarily just makes it more explicit and flexible for 
users who have multiple contexts in their config files.

WRT `spark.master`, Spark in general requires that to always be set and will use it to override whatever is present in the K8S config file regardless.


---




[GitHub] spark pull request #22904: [SPARK-25887][K8S] Configurable K8S context suppo...

2018-11-27 Thread aditanase
Github user aditanase commented on a diff in the pull request:

https://github.com/apache/spark/pull/22904#discussion_r236663827
  
--- Diff: docs/running-on-kubernetes.md ---
@@ -298,6 +298,16 @@ the Spark application.
 
 ## Kubernetes Features
 
+### Configuration File
+
+Your Kubernetes config file typically lives under `.kube/config` in your 
home directory or in a location specified by the `KUBECONFIG` environment 
variable.  Spark on Kubernetes will attempt to use this file to do an initial 
auto-configuration of the Kubernetes client used to interact with the 
Kubernetes cluster.  A variety of Spark configuration properties are provided 
that allow further customising the client configuration e.g. using an 
alternative authentication method.
--- End diff --

I would add more details about what configuration we can reuse from the context. For example, would `spark.master` be taken from the Kubernetes master? I'm trying to understand what the net new benefit is for somebody using this in their day-to-day work.

Also, now that we have client mode, I anticipate many users switching to 
that to emulate a typical Mesos deployment (e.g. driver in client mode, kept 
alive by Marathon). I am assuming this does not apply in client mode, where all 
the options need to be explicitly specified.


---




[GitHub] spark pull request #22904: [SPARK-25887][K8S] Configurable K8S context suppo...

2018-11-02 Thread rvesse
Github user rvesse commented on a diff in the pull request:

https://github.com/apache/spark/pull/22904#discussion_r230318413
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala
 ---
@@ -23,6 +23,18 @@ import org.apache.spark.internal.config.ConfigBuilder
 
 private[spark] object Config extends Logging {
 
+  val KUBERNETES_CONTEXT =
+ConfigBuilder("spark.kubernetes.context")
+  .doc("The desired context from your K8S config file used to configure the K8S " +
+"client for interacting with the cluster.  Useful if your config file has " +
+"multiple clusters or user identities defined.  The client library used " +
+"locates the config file via the KUBECONFIG environment variable or by defaulting " +
+"to .kube/config under your home directory.  If not specified then your current " +
+"context is used.  You can always override specific aspects of the config file " +
+"provided configuration using other Spark on K8S configuration options.")
+  .stringConf
+  .createWithDefault(null)
--- End diff --

Refactored to use `Option`


---




[GitHub] spark pull request #22904: [SPARK-25887][K8S] Configurable K8S context suppo...

2018-11-01 Thread mccheah
Github user mccheah commented on a diff in the pull request:

https://github.com/apache/spark/pull/22904#discussion_r230127363
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala
 ---
@@ -23,6 +23,18 @@ import org.apache.spark.internal.config.ConfigBuilder
 
 private[spark] object Config extends Logging {
 
+  val KUBERNETES_CONTEXT =
+ConfigBuilder("spark.kubernetes.context")
+  .doc("The desired context from your K8S config file used to configure the K8S " +
+"client for interacting with the cluster.  Useful if your config file has " +
+"multiple clusters or user identities defined.  The client library used " +
+"locates the config file via the KUBECONFIG environment variable or by defaulting " +
+"to .kube/config under your home directory.  If not specified then your current " +
+"context is used.  You can always override specific aspects of the config file " +
+"provided configuration using other Spark on K8S configuration options.")
+  .stringConf
+  .createWithDefault(null)
--- End diff --

Default of `null` seems strange, surely we can make this `Optional` and 
just handle `empty` in the places that use this?
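A self-contained model (names hypothetical, not Spark's internal config API) of the Option-based shape being suggested, where the absent case is `None` rather than `null`:

```scala
// Modelled lookup: an unset or blank context yields None, so call sites
// can pattern-match or use getOrElse instead of null-checking.
def kubernetesContext(conf: Map[String, String]): Option[String] =
  conf.get("spark.kubernetes.context").filter(_.nonEmpty)
```

With this shape, the factory's `kubeContext.getOrElse(null)` call becomes the single place where Fabric 8's null convention is bridged.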


---




[GitHub] spark pull request #22904: [SPARK-25887][K8S] Configurable K8S context suppo...

2018-10-31 Thread rvesse
GitHub user rvesse opened a pull request:

https://github.com/apache/spark/pull/22904

[SPARK-25887][K8S] Configurable K8S context support

## What changes were proposed in this pull request?

This enhancement allows for specifying the desired context to use for
the initial K8S client auto-configuration.  This allows users to more
easily access alternative K8S contexts without having to first
explicitly change their current context via kubectl.

## How was this patch tested?

Explicitly set my K8S context to a context pointing to a non-existent 
cluster, then launched Spark jobs with explicitly specified contexts via the 
new `spark.kubernetes.context` configuration property.

Example Output:

```
> kubectl config current-context
minikube
> minikube status
minikube: Stopped
cluster: 
kubectl: 
> ./spark-submit --master k8s://https://localhost:6443 --deploy-mode 
cluster --name spark-pi --class org.apache.spark.examples.SparkPi --conf 
spark.executor.instances=2 --conf spark.kubernetes.context=docker-for-desktop 
--conf spark.kubernetes.container.image=rvesse/spark:debian 
local:///opt/spark/examples/jars/spark-examples_2.11-3.0.0-SNAPSHOT.jar 4
18/10/31 11:57:51 WARN NativeCodeLoader: Unable to load native-hadoop 
library for your platform... using builtin-java classes where applicable
18/10/31 11:57:51 INFO SparkKubernetesClientFactory: Auto-configuring K8S 
client using context docker-for-desktop from users K8S config file
18/10/31 11:57:52 INFO LoggingPodStatusWatcherImpl: State changed, new 
state: 
 pod name: spark-pi-1540987071845-driver
 namespace: default
 labels: spark-app-selector -> spark-2c4abc226ed3415986eb602bd13f3582, 
spark-role -> driver
 pod uid: 32462cac-dd04-11e8-b6c6-0251
 creation time: 2018-10-31T11:57:52Z
 service account name: default
 volumes: spark-local-dir-1, spark-conf-volume, default-token-glpfv
 node name: N/A
 start time: N/A
 phase: Pending
 container status: N/A
18/10/31 11:57:52 INFO LoggingPodStatusWatcherImpl: State changed, new 
state: 
 pod name: spark-pi-1540987071845-driver
 namespace: default
 labels: spark-app-selector -> spark-2c4abc226ed3415986eb602bd13f3582, 
spark-role -> driver
 pod uid: 32462cac-dd04-11e8-b6c6-0251
 creation time: 2018-10-31T11:57:52Z
 service account name: default
 volumes: spark-local-dir-1, spark-conf-volume, default-token-glpfv
 node name: docker-for-desktop
 start time: N/A
 phase: Pending
 container status: N/A
...
18/10/31 11:58:03 INFO LoggingPodStatusWatcherImpl: State changed, new 
state: 
 pod name: spark-pi-1540987071845-driver
 namespace: default
 labels: spark-app-selector -> spark-2c4abc226ed3415986eb602bd13f3582, 
spark-role -> driver
 pod uid: 32462cac-dd04-11e8-b6c6-0251
 creation time: 2018-10-31T11:57:52Z
 service account name: default
 volumes: spark-local-dir-1, spark-conf-volume, default-token-glpfv
 node name: docker-for-desktop
 start time: 2018-10-31T11:57:52Z
 phase: Succeeded
 container status: 
 container name: spark-kubernetes-driver
 container image: rvesse/spark:debian
 container state: terminated
 container started at: 2018-10-31T11:57:54Z
 container finished at: 2018-10-31T11:58:02Z
 exit code: 0
 termination reason: Completed
```

Without the `spark.kubernetes.context` setting this will fail because the 
current context - `minikube` - is pointing to a non-running cluster e.g.

```
> ./spark-submit --master k8s://https://localhost:6443 --deploy-mode 
cluster --name spark-pi --class org.apache.spark.examples.SparkPi --conf 
spark.executor.instances=2 --conf 
spark.kubernetes.container.image=rvesse/spark:debian 
local:///opt/spark/examples/jars/spark-examples_2.11-3.0.0-SNAPSHOT.jar 4
18/10/31 12:02:30 WARN NativeCodeLoader: Unable to load native-hadoop 
library for your platform... using builtin-java classes where applicable
18/10/31 12:02:30 INFO SparkKubernetesClientFactory: Auto-configuring K8S 
client using current context from users K8S config file
18/10/31 12:02:31 WARN WatchConnectionManager: Exec Failure
javax.net.ssl.SSLHandshakeException: 
sun.security.validator.ValidatorException: PKIX path building failed: 
sun.security.provider.certpath.SunCertPathBuilderException: unable to find 
valid certification path to requested target
at sun.security.ssl.Alerts.getSSLException(Alerts.java:192)
at sun.security.ssl.SSLSocketImpl.fatal(SSLSocketImpl.java:1949)
at sun.security.ssl.Handshaker.fatalSE(Handshaker.java:302)
at