[jira] [Comment Edited] (SPARK-25355) Support --proxy-user for Spark on K8s

Shrikant (Jira) Tue, 03 May 2022 22:18:07 -0700


    [ https://issues.apache.org/jira/browse/SPARK-25355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17531480#comment-17531480 ]


Shrikant edited comment on SPARK-25355 at 5/4/22 5:17 AM:
----------------------------------------------------------

[~gaborgsomogyi] I have attached the full driver log to the Jira. There are no 
executor logs because the driver failed before any executors were launched. I 
have also attached the client console log (client.log).

As the deploy-mode in the spark-submit command indicates, we are submitting the 
job in cluster mode to Spark on Kubernetes. The spark-submit inside the driver 
pod is launched in client mode. Spark itself does this on Kubernetes; we are 
not requesting it explicitly. The logs and the submit command are from the 
same job.

> What is the master plan to provide a TGT for the current user on the driver 
> POD?

This is what we are not clear about: what needs to be done to make the 
delegation token available to the proxy user? When we don't use a proxy user, 
HADOOP_TOKEN_FILE_LOCATION is used to load the tokens, and authentication is 
done with them.
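
One way to narrow this down is to check, inside the driver pod, whether 
HADOOP_TOKEN_FILE_LOCATION is set and points to a non-empty file in both the 
proxy-user and the non-proxy-user runs. The Python sketch below is only a 
diagnostic aid and not part of Spark or Hadoop. The function name 
describe_token_file and its status strings are made up for illustration, and 
the script does not parse or validate the tokens themselves.

```python
import os
from pathlib import Path

# Environment variable named in the comment above; Hadoop reads it to
# locate a delegation token file.
TOKEN_ENV_VAR = "HADOOP_TOKEN_FILE_LOCATION"


def describe_token_file(env=None):
    """Return a short status string for the token file named by the env var.

    Hypothetical helper: reports "unset", "missing", "empty", or "present".
    It only inspects the file system and never reads the token contents.
    """
    env = os.environ if env is None else env
    location = env.get(TOKEN_ENV_VAR)
    if not location:
        return "unset"
    path = Path(location)
    if not path.is_file():
        return "missing"
    if path.stat().st_size == 0:
        return "empty"
    return "present"


if __name__ == "__main__":
    print(f"{TOKEN_ENV_VAR}: {describe_token_file()}")
```

Comparing the result between a run with --proxy-user and a run without it 
could show whether the token file reaches the driver pod at all in the 
proxy-user case.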



> Support --proxy-user for Spark on K8s
> -------------------------------------
>
>                 Key: SPARK-25355
>                 URL: https://issues.apache.org/jira/browse/SPARK-25355
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Kubernetes, Spark Core
>    Affects Versions: 3.1.0
>            Reporter: Stavros Kontopoulos
>            Assignee: Pedro Rossi
>            Priority: Major
>             Fix For: 3.1.0
>
>         Attachments: client.log, driver.log
>
>
> SPARK-23257 adds Kerberized HDFS support for Spark on K8s. A major addition 
> needed is the support for proxy user. A proxy user is impersonated by a 
> superuser who executes operations on behalf of the proxy user. More on this: 
> [https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/Superusers.html]
> [https://github.com/spark-notebook/spark-notebook/blob/master/docs/proxyuser_impersonation.md]
> This has been implemented for YARN upstream and for Spark on Mesos here:
> [https://github.com/mesosphere/spark/pull/26]
> [~ifilonenko] creating this issue according to our discussion.



