[GitHub] spark issue #22433: [SPARK-25442][SQL][K8S] Support STS to run in k8s deploy...

2018-10-21 Thread suryag10
Github user suryag10 commented on the issue:

https://github.com/apache/spark/pull/22433
  
I am observing some odd behaviour when I try to respond to the comments
inline, so I am adding the responses to the comments here.
Following are the responses to the comments:
> The script may be run from a client machine outside a k8s cluster. In
this case, there's not even a pod. I would suggest separating the explanation
of the user flow details by the deploy mode (client vs cluster).

STS is a server, and the best way to deploy it in a K8S cluster is through
a Helm chart or a YAML file. (It can also be started with the method you
suggested, but I expect that scenario to be rare, and there would be no HA for
the STS server if it is launched from outside the cluster.)
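
For illustration, a minimal sketch of such a YAML deployment (the image,
service account, and port below are hypothetical placeholders, not part of
this PR):

```yaml
# Hypothetical sketch: running STS as a long-lived Kubernetes Deployment.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: spark-thrift-server
spec:
  replicas: 1
  selector:
    matchLabels:
      app: spark-thrift-server
  template:
    metadata:
      labels:
        app: spark-thrift-server
    spec:
      serviceAccountName: spark          # needs RBAC to create executor pods
      containers:
      - name: thrift-server
        image: my-registry/spark:latest  # placeholder Spark image
        command: ["/opt/spark/sbin/start-thriftserver.sh"]
        args: ["--master", "k8s://https://kubernetes.default.svc"]
        env:
        - name: SPARK_NO_DAEMONIZE       # keep the script in the foreground
          value: "true"
        ports:
        - containerPort: 10000           # default Thrift JDBC/ODBC port
```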

> In the scenario of a cluster-mode submission, what is the command-line
behavior? Does the thrift-server script "block" until the thrift server pod is
shut down?

By default the script returns immediately, but it can be made to block by
setting the environment variable SPARK_NO_DAEMONIZE. With that set, the script
blocks until the thrift server pod is shut down.
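
For reference, a minimal sketch of the blocking invocation (the API server
host and port are placeholders):

```bash
# Keep spark-daemon.sh in the foreground; the script then blocks until the
# thrift server driver pod shuts down instead of daemonizing.
export SPARK_NO_DAEMONIZE=true
sbin/start-thriftserver.sh \
  --master k8s://https://<k8s-apiserver-host>:<k8s-apiserver-port> \
  --deploy-mode cluster
```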

> If possible, there should be some basic integration testing. Run a thrift
server command against the minishift cluster used by the other testing.

Will add it as a separate PR.

Please merge this if you are OK with the responses.


---




[GitHub] spark issue #22433: [SPARK-25442][SQL][K8S] Support STS to run in k8s deploy...

2018-10-21 Thread suryag10
Github user suryag10 commented on the issue:

https://github.com/apache/spark/pull/22433
  
Can somebody please merge this?


---




[GitHub] spark pull request #22433: [SPARK-25442][SQL][K8S] Support STS to run in k8s...

2018-10-21 Thread suryag10
GitHub user suryag10 reopened a pull request:

https://github.com/apache/spark/pull/22433

[SPARK-25442][SQL][K8S] Support STS to run in k8s deployments with spark 
deployment mode as cluster.

## What changes were proposed in this pull request?

The code is enhanced to allow STS to run in Kubernetes deployments with the
Spark deploy mode of cluster.

  

## How was this patch tested?

Started STS in cluster mode in a K8S deployment and was able to run some
queries using the Beeline client.
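
For reference, a sketch of the kind of Beeline check used (host and query
are placeholders):

```bash
# Connect to the running thrift server and run a trivial query.
bin/beeline -u "jdbc:hive2://<thrift-server-host>:10000" -e "SHOW DATABASES;"
```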

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/suryag10/spark master

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/22433.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #22433


commit 3a7fa571181e4b0494f2b705fbd07bc61b0ca6ce
Author: Suryanarayana GARLAPATI 
Date:   2018-09-16T04:37:26Z

Support STS to run in k8s cluster mode

commit 3556a61241e3f4910673f1bbf701905870ed09ea
Author: Suryanarayana GARLAPATI 
Date:   2018-09-16T04:37:26Z

[SPARK-25442][SQL][K8S] Support STS to run in k8s deployments with spark 
deployment mode as cluster.

commit 78dc1a35f299d854b61c9b03e22730960c6280a2
Author: Suryanarayana GARLAPATI 
Date:   2018-09-19T05:16:48Z

Merge branch 'master' of https://github.com/suryag10/spark

commit a15f5313e7c798e58c80147e575218bb70fe2d74
Author: Suryanarayana GARLAPATI 
Date:   2018-09-19T05:16:48Z

Merge branch 'master' of https://github.com/suryag10/spark

commit 42dd479a33279b85fb9b3fa5a70570970e8148a1
Author: Suryanarayana GARLAPATI 
Date:   2018-09-19T17:08:58Z

Merge branch 'master' of https://github.com/suryag10/spark

commit 4a7e737ae1210451de668b13a72bbd9473721f45
Author: Suryanarayana GARLAPATI 
Date:   2018-09-19T17:08:58Z

Merge branch 'master' of https://github.com/suryag10/spark

commit d91fa2badf33cc4122d340af64ef669dddc66cf1
Author: Suryanarayana GARLAPATI 
Date:   2018-09-19T17:17:20Z

Merge branch 'master' of https://github.com/suryag10/spark

commit a65cfa56a64a7b9fb78a038ce6f9d25f2ce0e428
Author: Suryanarayana GARLAPATI 
Date:   2018-09-19T17:17:20Z

Merge branch 'master' of https://github.com/suryag10/spark

commit 8dc7ced8e44c1d75a27a15a9605d2bd8a693a732
Author: Suryanarayana GARLAPATI 
Date:   2018-09-19T17:23:03Z

Merge branch 'master' of https://github.com/suryag10/spark

commit 6e021e7706c12103ec9ce08b20d7fcb66c83aeb2
Author: Suryanarayana GARLAPATI 
Date:   2018-09-19T17:23:03Z

Merge branch 'master' of https://github.com/suryag10/spark

commit 12be1d2364e852b7cc27b78c0aec9740693e5cab
Author: Suryanarayana GARLAPATI 
Date:   2018-09-21T12:33:12Z

Merge branch 'master' of https://github.com/suryag10/spark




---




[GitHub] spark issue #22433: [SPARK-25442][SQL][K8S] Support STS to run in k8s deploy...

2018-10-21 Thread suryag10
Github user suryag10 commented on the issue:

https://github.com/apache/spark/pull/22433
  
Can somebody please merge this?


---




[GitHub] spark issue #22433: [SPARK-25442][SQL][K8S] Support STS to run in k8s deploy...

2018-10-21 Thread suryag10
Github user suryag10 commented on the issue:

https://github.com/apache/spark/pull/22433
  
.


---




[GitHub] spark issue #22433: [SPARK-25442][SQL][K8S] Support STS to run in k8s deploy...

2018-10-21 Thread suryag10
Github user suryag10 commented on the issue:

https://github.com/apache/spark/pull/22433
  
> If possible, there should be some basic integration testing. Run a thrift 
server command against the minishift cluster used by the other testing.

Will add this as a separate PR.


---




[GitHub] spark issue #22433: [SPARK-25442][SQL][K8S] Support STS to run in k8s deploy...

2018-10-21 Thread suryag10
Github user suryag10 commented on the issue:

https://github.com/apache/spark/pull/22433
  
> In the scenario of a cluster-mode submission, what is the command-line 
behavior? Does the thrift-server script "block" until the thrift server pod is 
shut down?

By default the script returns immediately, but it can be made to block by
setting the environment variable SPARK_NO_DAEMONIZE. With that set, the script
blocks until the thrift server pod is shut down.


---






[GitHub] spark pull request #22433: [SPARK-25442][SQL][K8S] Support STS to run in k8s...

2018-10-21 Thread suryag10
Github user suryag10 closed the pull request at:

https://github.com/apache/spark/pull/22433


---




[GitHub] spark pull request #22433: [SPARK-25442][SQL][K8S] Support STS to run in k8s...

2018-10-21 Thread suryag10
Github user suryag10 commented on a diff in the pull request:

https://github.com/apache/spark/pull/22433#discussion_r226898234
  
--- Diff: docs/running-on-kubernetes.md ---
@@ -340,6 +340,43 @@ RBAC authorization and how to configure Kubernetes 
service accounts for pods, pl
 [Using RBAC 
Authorization](https://kubernetes.io/docs/admin/authorization/rbac/) and
 [Configure Service Accounts for 
Pods](https://kubernetes.io/docs/tasks/configure-pod-container/configure-service-account/).
 
+## Running Spark Thrift Server
+
+Thrift JDBC/ODBC Server (aka Spark Thrift Server or STS) is Spark SQL’s 
port of Apache Hive’s HiveServer2 that allows
+JDBC/ODBC clients to execute SQL queries over JDBC and ODBC protocols on 
Apache Spark.
+
+### Client Deployment Mode
+
+To start STS in client mode, execute the following command:
+
+```bash
+$ sbin/start-thriftserver.sh \
+--master k8s://https://<k8s-apiserver-host>:<k8s-apiserver-port>
+```
+
+### Cluster Deployment Mode
+
+To start STS in cluster mode, execute the following command:
+
+```bash
+$ sbin/start-thriftserver.sh \
+--master k8s://https://<k8s-apiserver-host>:<k8s-apiserver-port> \
+--deploy-mode cluster
+```
+
+The most basic workflow is to use the pod name (driver pod name in case of
cluster mode and self pod name (pod/container from 
--- End diff --

STS is a server, and the best way to deploy it in a K8S cluster is through
a Helm chart or a YAML file. (It can also be started with the method you
suggested, but I expect that scenario to be rare, and there would be no HA for
the STS server if it is launched from outside the cluster.)


---






[GitHub] spark issue #22433: [SPARK-25442][SQL][K8S] Support STS to run in k8s deploy...

2018-10-21 Thread suryag10
Github user suryag10 commented on the issue:

https://github.com/apache/spark/pull/22433
  
@liyinan926 
> The script may be run from a client machine outside a k8s cluster. In
this case, there's not even a pod. I would suggest separating the explanation
of the user flow details by the deploy mode (client vs cluster).

STS is a server, and the best way to deploy it in a K8S cluster is through
a Helm chart or a YAML file. (It can also be started with the method you
suggested, but I expect that scenario to be rare, and there would be no HA for
the STS server if it is launched from outside the cluster.)




---




[GitHub] spark issue #21669: [SPARK-23257][K8S] Kerberos Support for Spark on K8S

2018-09-30 Thread suryag10
Github user suryag10 commented on the issue:

https://github.com/apache/spark/pull/21669
  
> like it, but we could also first support cluster mode and add client mode 
after.

That's the reason I said "Point to note" :-)


---




[GitHub] spark issue #21669: [SPARK-23257][K8S] Kerberos Support for Spark on K8S

2018-09-29 Thread suryag10
Github user suryag10 commented on the issue:

https://github.com/apache/spark/pull/21669
  
Hi Ilan,
Point to note: this Kerberos support for Spark on K8S works only for
cluster-mode deployments. As client mode is now also supported in Spark on
K8S, we should plan to support Kerberos for client mode as well.
 


---




[GitHub] spark issue #22433: [SPARK-25442][SQL][K8S] Support STS to run in k8s deploy...

2018-09-21 Thread suryag10
Github user suryag10 commented on the issue:

https://github.com/apache/spark/pull/22433
  
Can somebody please review and merge?


---




[GitHub] spark pull request #22433: [SPARK-25442][SQL][K8S] Support STS to run in k8s...

2018-09-21 Thread suryag10
Github user suryag10 commented on a diff in the pull request:

https://github.com/apache/spark/pull/22433#discussion_r219476048
  
--- Diff: docs/running-on-kubernetes.md ---
@@ -340,6 +340,39 @@ RBAC authorization and how to configure Kubernetes 
service accounts for pods, pl
 [Using RBAC 
Authorization](https://kubernetes.io/docs/admin/authorization/rbac/) and
 [Configure Service Accounts for 
Pods](https://kubernetes.io/docs/tasks/configure-pod-container/configure-service-account/).
 
+## Running Spark Thrift Server
+
+Thrift JDBC/ODBC Server (aka Spark Thrift Server or STS) is Spark SQL’s 
port of Apache Hive’s HiveServer2 that allows
+JDBC/ODBC clients to execute SQL queries over JDBC and ODBC protocols on 
Apache Spark.
+
+### Spark deploy mode of Client
+
+To start STS in client mode, execute the following command:
+
+$ sbin/start-thriftserver.sh \
+--master k8s://https://<k8s-apiserver-host>:<k8s-apiserver-port>
+
+### Spark deploy mode of Cluster
+
+To start STS in cluster mode, execute the following command:
+
+$ sbin/start-thriftserver.sh \
+--master k8s://https://<k8s-apiserver-host>:<k8s-apiserver-port> \
+--deploy-mode cluster
+
+The most basic workflow is to use the pod name (driver pod name in case of
cluster mode and self pod name in case of client
--- End diff --

pod/container from which the STS command is executed


---




[GitHub] spark issue #22433: [SPARK-25442][SQL][K8S] Support STS to run in k8s deploy...

2018-09-18 Thread suryag10
Github user suryag10 commented on the issue:

https://github.com/apache/spark/pull/22433
  
@mridulm @liyinan926 @jacobdr @ifilonenko 
A code check handling spaces and "/" is already present at
https://github.com/apache/spark/blob/master/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/submit/KubernetesClientApplication.scala#L259

I have reverted the fix in start-thriftserver.sh. Please review and merge.


---




[GitHub] spark issue #22433: [SPARK-25442][SQL][K8S] Support STS to run in k8s deploy...

2018-09-18 Thread suryag10
Github user suryag10 commented on the issue:

https://github.com/apache/spark/pull/22433
  
> > Agreed with @mridulm that the naming restriction is specific to k8s and 
should be handled in a k8s specific way, e.g., somewhere around 
https://github.com/apache/spark/blob/master/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/submit/KubernetesClientApplication.scala#L208.
> 
> Ok, Will update the PR with the same.

Hi, handling of this conversion is already present in

https://github.com/apache/spark/blob/master/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/submit/KubernetesClientApplication.scala#L259

I have reverted the change in the start-thriftserver.sh file. Please review
and merge.
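
For illustration, the kind of normalization that code performs looks roughly
like the sketch below (illustrative only, not the exact Spark code):

```scala
// Lowercase the app name and replace characters that are invalid in a
// Kubernetes resource name (spaces, '/', etc.) with '-'.
def toDnsCompliantPrefix(appName: String, launchTime: Long): String =
  s"$appName-$launchTime"
    .trim
    .toLowerCase
    .replaceAll("[^a-z0-9.-]", "-") // spaces, '/', and other invalid characters
    .replaceAll("-+", "-")          // collapse consecutive '-'

// toDnsCompliantPrefix("Thrift JDBC/ODBC Server", 1537079590890L)
//   == "thrift-jdbc-odbc-server-1537079590890"
```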


---




[GitHub] spark issue #22433: [SPARK-25442][SQL][K8S] Support STS to run in k8s deploy...

2018-09-17 Thread suryag10
Github user suryag10 commented on the issue:

https://github.com/apache/spark/pull/22433
  
> Agreed with @mridulm that the naming restriction is specific to k8s and 
should be handled in a k8s specific way, e.g., somewhere around 
https://github.com/apache/spark/blob/master/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/submit/KubernetesClientApplication.scala#L208.

OK, I will update the PR with the same.


---




[GitHub] spark issue #20272: [SPARK-23078] [CORE] [K8s] allow Spark Thrift Server to ...

2018-09-17 Thread suryag10
Github user suryag10 commented on the issue:

https://github.com/apache/spark/pull/20272
  
> If this can get a rebase, and maybe a few sentences in the k8s docs, I'll 
merge.

Hi, I have rebased this patch onto the latest master; the patch was missing
another fix needed to run STS in K8S cluster mode. The follow-up PR is:

https://github.com/apache/spark/pull/22433

Could you please review it?


---




[GitHub] spark issue #22433: [SPARK-25442][SQL][K8S] Support STS to run in k8s deploy...

2018-09-16 Thread suryag10
Github user suryag10 commented on the issue:

https://github.com/apache/spark/pull/22433
  
> I'm wondering, is there some reason this isn't supported in cluster mode 
for yarn & mesos? Or put another way, what is the rationale for k8s being added 
as an exception to this rule?

I don't know the specific reason why this was not supported on YARN and
Mesos. The initial contributions to Spark on K8S started with cluster mode
(with a restriction against client mode), so this PR enables STS to run in K8S
deployments with the Spark cluster mode. (In the latest Spark code I observed
that client mode also works; I need to cross-verify this.)


---




[GitHub] spark issue #22433: [SPARK-25442][SQL][K8S] Support STS to run in k8s deploy...

2018-09-16 Thread suryag10
Github user suryag10 commented on the issue:

https://github.com/apache/spark/pull/22433
  
> It is an implementation detail of k8s integration that application name 
is expected to be DNS compliant ... spark does not have that requirement; and 
yarn/mesos/standalone/local work without this restriction.
> The right fix in k8s integration would be to sanitize the name specified 
by user/application to be compliant with its requirements. This will help not 
just with thrift server, but any spark application.

As this script is the common starting point for all the resource managers
(k8s/yarn/mesos/standalone/local), I think changing it to fit all the cases
adds value, instead of handling this at each resource-manager level. Thoughts?


---




[GitHub] spark issue #22433: [SPARK-25442][SQL][K8S] Support STS to run in k8s deploy...

2018-09-16 Thread suryag10
Github user suryag10 commented on the issue:

https://github.com/apache/spark/pull/22433
  
> Does it fail in k8s or does spark k8s code error out ?
> If former, why not fix “name” handling in k8s to replace unsupported 
characters ?

Following is the error seen without the fix:
Exception in thread "main" 
io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: POST 
at: 
https://k8s-apiserver.bcmt.cluster.local:8443/api/v1/namespaces/default/pods. 
Message: Pod "thrift jdbc/odbc server-1537079590890-driver" is invalid: 
metadata.name: Invalid value: "thrift jdbc/odbc server-1537079590890-driver": a 
DNS-1123 subdomain must consist of lower case alphanumeric characters, '-' or 
'.', and must start and end with an alphanumeric character (e.g. 'example.com', 
regex used for validation is 
'[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*'). Received 
status: Status(apiVersion=v1, code=422, 
details=StatusDetails(causes=[StatusCause(field=metadata.name, message=Invalid 
value: "thrift jdbc/odbc server-1537079590890-driver": a DNS-1123 subdomain 
must consist of lower case alphanumeric characters, '-' or '.', and must start 
and end with an alphanumeric character (e.g. 'example.com', regex used for 
validation is '[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*'), reason=FieldValueInvalid, 
additionalProperties={})], group=null, kind=Pod, name=thrift jdbc/odbc 
server-1537079590890-driver, retryAfterSeconds=null, uid=null, 
additionalProperties={}), kind=Status, message=Pod "thrift jdbc/odbc 
server-1537079590890-driver" is invalid: metadata.name: Invalid value: "thrift 
jdbc/odbc server-1537079590890-driver": a DNS-1123 subdomain must consist of 
lower case alphanumeric characters, '-' or '.', and must start and end with an 
alphanumeric character (e.g. 'example.com', regex used for validation is 
'[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*'), 
metadata=ListMeta(resourceVersion=null, selfLink=null, 
additionalProperties={}), reason=Invalid, status=Failure, 
additionalProperties={}).

This is not specific to Kubernetes; it is a generic DNS (DNS-1123) naming
restriction.
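
For illustration, the name can be checked against the DNS-1123 subdomain
regex quoted in the error above (this is just a shell-level check, not
Spark's validation code):

```bash
# The first name (with a space and '/') fails; the sanitized one passes.
regex='^[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*$'
for name in 'thrift jdbc/odbc server-1537079590890-driver' \
            'thrift-jdbc-odbc-server-1537079590890-driver'; do
  [[ "$name" =~ $regex ]] && echo "valid:   $name" || echo "invalid: $name"
done
```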


---




[GitHub] spark issue #22433: Support STS to run in k8s deployments with spark deploym...

2018-09-15 Thread suryag10
Github user suryag10 commented on the issue:

https://github.com/apache/spark/pull/22433
  
> Thank you for your first contribution, @suryag10 .
> 
> * Could you file a SPARK JIRA issue since this is a code change?
Sure.
> * Could you update the PR title like the other PRs? e.g. 
`[SPARK-XXX][SQL][K8S] ...`?
Sure.
> 
> And, just out of curious, do we need this change?
> 
> ```shell
> - exec "${SPARK_HOME}"/sbin/spark-daemon.sh submit $CLASS 1 --name 
"Thrift JDBC/ODBC Server" "$@"
> + exec "${SPARK_HOME}"/sbin/spark-daemon.sh submit $CLASS 1 --name 
"Thrift-JDBC-ODBC-Server" "$@"
> ```

Without the above change, the driver pod fails to start as well: spaces and
"/" are not allowed in a "name" in the Kubernetes world.



---




[GitHub] spark pull request #22433: Support STS to run in k8s cluster mode

2018-09-15 Thread suryag10
GitHub user suryag10 opened a pull request:

https://github.com/apache/spark/pull/22433

Support STS to run in k8s cluster mode

## What changes were proposed in this pull request?

The code is enhanced to allow STS to run in Kubernetes deployments with the
Spark deploy mode of cluster.

  

## How was this patch tested?

Started STS in cluster mode in a K8S deployment and was able to run some
queries using the Beeline client.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/suryag10/spark master

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/22433.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #22433


commit 3a7fa571181e4b0494f2b705fbd07bc61b0ca6ce
Author: Suryanarayana GARLAPATI 
Date:   2018-09-16T04:37:26Z

Support STS to run in k8s cluster mode




---




[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-09-05 Thread suryag10
Github user suryag10 commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r215338063
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/hadoopsteps/HadoopBootstrapUtil.scala
 ---
@@ -0,0 +1,186 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.deploy.k8s.features.hadoopsteps
+
+import java.io.File
+
+import scala.collection.JavaConverters._
+
+import com.google.common.base.Charsets
+import com.google.common.io.Files
+import io.fabric8.kubernetes.api.model.{ConfigMap, ConfigMapBuilder, 
ContainerBuilder, KeyToPathBuilder, PodBuilder}
+
+import org.apache.spark.deploy.k8s.Constants._
+import org.apache.spark.deploy.k8s.SparkPod
+import 
org.apache.spark.deploy.k8s.security.KubernetesHadoopDelegationTokenManager
+
+private[spark] object HadoopBootstrapUtil {
+
+   /**
+* Mounting the DT secret for both the Driver and the executors
+*
+* @param dtSecretName Name of the secret that stores the Delegation 
Token
+* @param dtSecretItemKey Name of the Item Key storing the Delegation 
Token
+* @param userName Name of the SparkUser to set SPARK_USER
+* @param fileLocation Location of the krb5 file
+* @param krb5ConfName Name of the ConfigMap for Krb5
+* @param pod Input pod to be appended to
+* @return a modified SparkPod
+*/
+  def bootstrapKerberosPod(
+  dtSecretName: String,
+  dtSecretItemKey: String,
+  userName: String,
+  fileLocation: String,
+  krb5ConfName: String,
+  pod: SparkPod) : SparkPod = {
+  val krb5File = new File(fileLocation)
+  val fileStringPath = krb5File.toPath.getFileName.toString
+  val kerberizedPod = new PodBuilder(pod.pod)
+.editOrNewSpec()
+  .addNewVolume()
+.withName(SPARK_APP_HADOOP_SECRET_VOLUME_NAME)
+.withNewSecret()
+  .withSecretName(dtSecretName)
+  .endSecret()
+.endVolume()
+  .addNewVolume()
+.withName(KRB_FILE_VOLUME)
+  .withNewConfigMap()
+.withName(krb5ConfName)
+.withItems(new KeyToPathBuilder()
+  .withKey(fileStringPath)
+  .withPath(fileStringPath)
+  .build())
+.endConfigMap()
+  .endVolume()
+// TODO: (ifilonenko) make configurable PU(G)ID
+  .editOrNewSecurityContext()
+.withRunAsUser(1000L)
+.withFsGroup(2000L)
+.endSecurityContext()
+  .endSpec()
+.build()
+  val kerberizedContainer = new ContainerBuilder(pod.container)
+.addNewVolumeMount()
+  .withName(SPARK_APP_HADOOP_SECRET_VOLUME_NAME)
+  .withMountPath(SPARK_APP_HADOOP_CREDENTIALS_BASE_DIR)
+  .endVolumeMount()
+.addNewVolumeMount()
+  .withName(KRB_FILE_VOLUME)
+  .withMountPath(KRB_FILE_DIR_PATH)
--- End diff --

Hi Stavros,
The main aim of the following step was to mount the krb5.conf file into the
/etc/ directory:

.addNewVolumeMount()
  .withName(KRB_FILE_VOLUME)
  .withMountPath(KRB_FILE_DIR_PATH)

When this is done, all the contents actually present inside the container's
/etc/ directory are lost; /etc gets mounted with only the krb5.conf file, with
read permissions. Because the other contents of /etc are gone, nothing works.
As I commented earlier, this also makes the driver pod fail to spawn. To make
things correct and mount only the krb5.conf file, the following should be
done:

.addNewVolumeMount()
  .withName(KRB_FILE_VOLUME)
  .withMountPath(KRB_FILE_DIR_PATH + "/krb5.conf")
  .withSubPath("krb5.conf")
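
A self-contained sketch of this fix, assuming the PR's constants
(KRB_FILE_VOLUME for the volume name and KRB_FILE_DIR_PATH = "/etc"); the
helper name is illustrative, not the PR's actual code:

```scala
import io.fabric8.kubernetes.api.model.{Container, ContainerBuilder}

// Mount only krb5.conf at <krbDir>/krb5.conf via a subPath, so the rest of
// the container's /etc directory is not shadowed by the ConfigMap volume.
def mountKrb5Only(container: Container, krbVolume: String, krbDir: String): Container =
  new ContainerBuilder(container)
    .addNewVolumeMount()
      .withName(krbVolume)                  // e.g. KRB_FILE_VOLUME
      .withMountPath(krbDir + "/krb5.conf") // e.g. /etc/krb5.conf
      .withSubPath("krb5.conf")             // mount the single key only
      .endVolumeMount()
    .build()
```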

[GitHub] spark issue #21669: [SPARK-23257][K8S][WIP] Kerberos Support for Spark on K8...

2018-09-03 Thread suryag10
Github user suryag10 commented on the issue:

https://github.com/apache/spark/pull/21669
  
Hi Ilan,
I was able to make Kerberos work with one workaround (which I am trying to
turn into a full fix) and one fix.
The fix is the one I commented on earlier, as follows:

The code changed from

 .withName(KRB_FILE_VOLUME)
 .withMountPath(KRB_FILE_DIR_PATH)

to

 .withName(KRB_FILE_VOLUME)
 .withMountPath(KRB_FILE_DIR_PATH + "/krb5.conf")
 .withSubPath("krb5.conf")

The workaround is described as follows. What I observed was that when an
executor pod was being created, the following function was called twice (file
KerberosConfExecutorFeatureStep.scala):

override def configurePod(pod: SparkPod): SparkPod = {

and the following error appeared:

2018-09-02 05:01:12 ERROR Utils:91 - Uncaught exception in thread 
kubernetes-executor-snapshots-subscribers-1
io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: 
POST at: https://kubernetes.default.svc/api/v1/namespaces/default/pods. 
Message: Pod "kerberos-1535864447099-exec-1" is invalid: [spec.volumes[4].name: 
Duplicate value: "hadoop-secret", spec.volumes[5].name: Duplicate value: 
"krb5-file", spec.containers[0].volumeMounts[4].mountPath: Invalid value: 
"/mnt/secrets/hadoop-credentials": must be unique, 
spec.containers[0].volumeMounts[5].mountPath: Invalid value: "/etc/krb5.conf": 
must be unique]. Received status: Status(apiVersion=v1, code=422, 
details=StatusDetails(causes=[StatusCause(field=spec.volumes[4].name, 
message=Duplicate value: "hadoop-secret", reason=FieldValueDuplicate, 
additionalProperties={}), StatusCause(field=spec.volumes[5].name, 
message=Duplicate value: "krb5-file", reason=FieldValueDuplicate, 
additionalProperties={}), 
StatusCause(field=spec.containers[0].volumeMounts[4].mountPath, message=Invalid 
value: "/mnt/secrets/hadoop-credentials": must
  be unique, reason=FieldValueInvalid, additionalProperties={}), 
StatusCause(field=spec.containers[0].volumeMounts[5].mountPath, message=Invalid 
value: "/etc/krb5.conf": must be unique, reason=FieldValueInvalid, 
additionalProperties={})], group=null, kind=Pod, 
name=kerberos-1535864447099-exec-1, retryAfterSeconds=null, uid=null, 
additionalProperties={}), kind=Status, message=Pod 
"kerberos-1535864447099-exec-1" is invalid: [spec.volumes[4].name: Duplicate 
value: "hadoop-secret", spec.volumes[5].name: Duplicate value: "krb5-file", 
spec.containers[0].volumeMounts[4].mountPath: Invalid value: 
"/mnt/secrets/hadoop-credentials": must be unique, 
spec.containers[0].volumeMounts[5].mountPath: Invalid value: "/etc/krb5.conf": 
must be unique], metadata=ListMeta(resourceVersion=null, selfLink=null, 
additionalProperties={}), reason=Invalid, status=Failure, 
additionalProperties={}).

So I kept a workaround not to call the above function a second time when an
executor pod is being created.

With the above I was able to run both the HDFS and Hive sample test cases
successfully.
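
A minimal sketch of one way to guard against the double invocation (this is a
hypothetical helper, not the PR's actual fix; krbVolumeName would be the PR's
KRB_FILE_VOLUME constant, e.g. "krb5-file"):

```scala
import scala.collection.JavaConverters._
import io.fabric8.kubernetes.api.model.Pod

// Returns true if the Kerberos volume is already on the pod spec, so a
// second configurePod call can no-op instead of adding duplicate volumes.
def alreadyKerberized(pod: Pod, krbVolumeName: String): Boolean =
  Option(pod.getSpec)
    .map(_.getVolumes.asScala.exists(_.getName == krbVolumeName))
    .getOrElse(false)
```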

Regards
Surya



---




[GitHub] spark pull request #21669: [SPARK-23257][K8S][WIP] Kerberos Support for Spar...

2018-08-28 Thread suryag10
Github user suryag10 commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r213408857
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/hadoopsteps/HadoopBootstrapUtil.scala
 ---
@@ -0,0 +1,186 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.deploy.k8s.features.hadoopsteps
+
+import java.io.File
+
+import scala.collection.JavaConverters._
+
+import com.google.common.base.Charsets
+import com.google.common.io.Files
+import io.fabric8.kubernetes.api.model.{ConfigMap, ConfigMapBuilder, 
ContainerBuilder, KeyToPathBuilder, PodBuilder}
+
+import org.apache.spark.deploy.k8s.Constants._
+import org.apache.spark.deploy.k8s.SparkPod
+import 
org.apache.spark.deploy.k8s.security.KubernetesHadoopDelegationTokenManager
+
+private[spark] object HadoopBootstrapUtil {
+
+   /**
+* Mounting the DT secret for both the Driver and the executors
+*
+* @param dtSecretName Name of the secret that stores the Delegation 
Token
+* @param dtSecretItemKey Name of the Item Key storing the Delegation 
Token
+* @param userName Name of the SparkUser to set SPARK_USER
+* @param fileLocation Location of the krb5 file
+* @param krb5ConfName Name of the ConfigMap for Krb5
+* @param pod Input pod to be appended to
+* @return a modified SparkPod
+*/
+  def bootstrapKerberosPod(
+  dtSecretName: String,
+  dtSecretItemKey: String,
+  userName: String,
+  fileLocation: String,
+  krb5ConfName: String,
+  pod: SparkPod) : SparkPod = {
+  val krb5File = new File(fileLocation)
+  val fileStringPath = krb5File.toPath.getFileName.toString
+  val kerberizedPod = new PodBuilder(pod.pod)
+.editOrNewSpec()
+  .addNewVolume()
+.withName(SPARK_APP_HADOOP_SECRET_VOLUME_NAME)
+.withNewSecret()
+  .withSecretName(dtSecretName)
+  .endSecret()
+.endVolume()
+  .addNewVolume()
+.withName(KRB_FILE_VOLUME)
+  .withNewConfigMap()
+.withName(krb5ConfName)
+.withItems(new KeyToPathBuilder()
+  .withKey(fileStringPath)
+  .withPath(fileStringPath)
+  .build())
+.endConfigMap()
+  .endVolume()
+// TODO: (ifilonenko) make configurable PU(G)ID
+  .editOrNewSecurityContext()
+.withRunAsUser(1000L)
+.withFsGroup(2000L)
+.endSecurityContext()
+  .endSpec()
+.build()
+  val kerberizedContainer = new ContainerBuilder(pod.container)
+.addNewVolumeMount()
+  .withName(SPARK_APP_HADOOP_SECRET_VOLUME_NAME)
+  .withMountPath(SPARK_APP_HADOOP_CREDENTIALS_BASE_DIR)
+  .endVolumeMount()
+.addNewVolumeMount()
+  .withName(KRB_FILE_VOLUME)
+  .withMountPath(KRB_FILE_DIR_PATH)
--- End diff --

This makes "/etc/" mounted as read-only (shadowing its existing contents). I
guess we should instead use:
  .withMountPath(KRB_FILE_DIR_PATH + "/krb5.conf")
  .withSubPath("krb5.conf")



---
