[ 
https://issues.apache.org/jira/browse/SPARK-33485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuan Jiao updated SPARK-33485:
------------------------------
    Description: 
My Spark application, which accesses kerberized HDFS, is running in a Kubernetes 
cluster, but the application log shows "Setting 
spark.hadoop.yarn.resourcemanager.principal to tester" (tester is one of my 
Kerberos principals, yet I use the other principal, joan, to read the HDFS files):

... 

+ CMD=("$SPARK_HOME/bin/spark-submit" --conf 
"spark.driver.bindAddress=$SPARK_DRIVER_BIND_ADDRESS" --deploy-mode client "$@")
+ exec /usr/bin/tini -s -- /opt/spark/bin/spark-submit --conf 
spark.driver.bindAddress=10.244.1.61 --deploy-mode client --properties-file 
/opt/spark/conf/spark.properties --class WordCount 
local:///opt/spark/jars/WordCount-1.0-SNAPSHOT.jar
*Setting spark.hadoop.yarn.resourcemanager.principal to tester*

 

...

20/11/19 04:31:28 INFO HadoopFSDelegationTokenProvider: getting token for: 
DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-1041285450_1, 
ugi=*tester@JOANTEST* (auth:KERBEROS)]] with renewer tester
20/11/19 04:31:37 INFO DFSClient: Created HDFS_DELEGATION_TOKEN token 60 for 
tester on ha-hdfs:nameservice1
20/11/19 04:31:37 INFO HadoopFSDelegationTokenProvider: getting token for: 
DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-1041285450_1, 
ugi=*tester@JOANTEST* (auth:KERBEROS)]] with renewer tester@JOANTEST
20/11/19 04:31:37 INFO DFSClient: Created HDFS_DELEGATION_TOKEN token 61 for 
*tester* on ha-hdfs:nameservice1
20/11/19 04:31:37 INFO HadoopFSDelegationTokenProvider: Renewal interval is 
86400073 for token HDFS_DELEGATION_TOKEN

...

 20/11/19 04:31:51 INFO UserGroupInformation: *Login successful for user joan 
using keytab file /opt/hadoop/conf/joan.keytab*

...
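
In case it is useful, the identities involved can be checked from inside the driver 
pod with klist (a sketch; I am assuming the MIT Kerberos client tools are available 
in the image):

# Show the principal in the default credential cache (this might be where
# "tester" is coming from), then list the principals stored in the joan keytab.
klist
klist -kt /opt/hadoop/conf/joan.keytab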

 

I don't know why YARN authentication is needed here, and why the principal 
tester is used for authorization. Can anyone help? Thanks!
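
For reference, here is roughly how I understand the Kerberos identity can be 
pinned explicitly at submit time in Spark 3.0 (a minimal sketch only; the master 
URL and container image name are placeholders, while the joan principal and 
keytab path are the ones shown in the log above):

# Sketch: explicitly selecting the Kerberos identity for the job via
# spark.kerberos.principal / spark.kerberos.keytab (the Spark 3.0 config names).
# The master URL and image name are placeholders.
$SPARK_HOME/bin/spark-submit \
  --master k8s://https://<k8s-api-server>:6443 \
  --deploy-mode cluster \
  --name wordcount \
  --class WordCount \
  --conf spark.kubernetes.container.image=<my-spark-image> \
  --conf spark.kerberos.principal=joan@JOANTEST \
  --conf spark.kerberos.keytab=/opt/hadoop/conf/joan.keytab \
  local:///opt/spark/jars/WordCount-1.0-SNAPSHOT.jar

With the principal and keytab passed this way, I would expect the delegation 
tokens above to be obtained as joan rather than tester, but perhaps I am missing 
something about how the identity is chosen.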

The log and my Spark project are attached below for reference.

 

  was:
My Spark application, which accesses kerberized HDFS, is running in a Kubernetes 
cluster, but the application log shows "Setting 
spark.hadoop.yarn.resourcemanager.principal to joan1" (joan1 is one of my 
Kerberos principals, yet I am not using this principal to read the HDFS files):

... 

+ SPARK_CLASSPATH='/opt/hadoop/conf::/opt/spark/jars/*'
+ case "$1" in
+ shift 1
+ CMD=("$SPARK_HOME/bin/spark-submit" --conf 
"spark.driver.bindAddress=$SPARK_DRIVER_BIND_ADDRESS" --deploy-mode client "$@")
+ exec /usr/bin/tini -s -- /opt/spark/bin/spark-submit --conf 
spark.driver.bindAddress=10.244.1.67 --deploy-mode client --properties-file 
/opt/spark/conf/spark.properties --class WordCount 
local:///opt/spark/jars/WordCount-1.0-SNAPSHOT.jar
+*Setting spark.hadoop.yarn.resourcemanager.principal to joan1*+

...

20/11/19 05:43:07 INFO HadoopFSDelegationTokenProvider: getting token for: 
DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-1695624590_1, 
ugi=joan1@JOANTEST (auth:KERBEROS)]] with renewer joan1
 20/11/19 05:43:16 INFO DFSClient: Created HDFS_DELEGATION_TOKEN token 71 for 
joan1 on ha-hdfs:nameservice1
 20/11/19 05:43:16 INFO HadoopFSDelegationTokenProvider: getting token for: 
DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-1695624590_1, 
ugi=joan1@JOANTEST (auth:KERBEROS)]] with renewer joan1@JOANTEST
 20/11/19 05:43:16 INFO DFSClient: Created HDFS_DELEGATION_TOKEN token 72 for 
joan1 on ha-hdfs:nameservice1

...

 

I don't know why YARN authentication is needed here. Can anyone help? Thanks!

The log and my Spark project are attached below for reference.

 


> Running Spark application in Kubernetes, but the application log shows YARN 
> authentications 
> --------------------------------------------------------------------------------------------
>
>                 Key: SPARK-33485
>                 URL: https://issues.apache.org/jira/browse/SPARK-33485
>             Project: Spark
>          Issue Type: Bug
>          Components: Kubernetes
>    Affects Versions: 3.0.0
>            Reporter: Yuan Jiao
>            Priority: Major
>         Attachments: application.log, project.rar
>
>
> My Spark application, which accesses kerberized HDFS, is running in a Kubernetes 
> cluster, but the application log shows "Setting 
> spark.hadoop.yarn.resourcemanager.principal to tester" (tester is one of my 
> Kerberos principals, yet I use the other principal, joan, to read the HDFS files):
> ... 
> + CMD=("$SPARK_HOME/bin/spark-submit" --conf 
> "spark.driver.bindAddress=$SPARK_DRIVER_BIND_ADDRESS" --deploy-mode client 
> "$@")
> + exec /usr/bin/tini -s -- /opt/spark/bin/spark-submit --conf 
> spark.driver.bindAddress=10.244.1.61 --deploy-mode client --properties-file 
> /opt/spark/conf/spark.properties --class WordCount 
> local:///opt/spark/jars/WordCount-1.0-SNAPSHOT.jar
> *Setting spark.hadoop.yarn.resourcemanager.principal to tester*
>  
> ...
> 20/11/19 04:31:28 INFO HadoopFSDelegationTokenProvider: getting token for: 
> DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-1041285450_1, 
> ugi=*tester@JOANTEST* (auth:KERBEROS)]] with renewer tester
> 20/11/19 04:31:37 INFO DFSClient: Created HDFS_DELEGATION_TOKEN token 60 for 
> tester on ha-hdfs:nameservice1
> 20/11/19 04:31:37 INFO HadoopFSDelegationTokenProvider: getting token for: 
> DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-1041285450_1, 
> ugi=*tester@JOANTEST* (auth:KERBEROS)]] with renewer tester@JOANTEST
> 20/11/19 04:31:37 INFO DFSClient: Created HDFS_DELEGATION_TOKEN token 61 for 
> *tester* on ha-hdfs:nameservice1
> 20/11/19 04:31:37 INFO HadoopFSDelegationTokenProvider: Renewal interval is 
> 86400073 for token HDFS_DELEGATION_TOKEN
> ...
>  20/11/19 04:31:51 INFO UserGroupInformation: *Login successful for user joan 
> using keytab file /opt/hadoop/conf/joan.keytab*
> ...
>  
> I don't know why YARN authentication is needed here, and why the principal 
> tester is used for authorization. Can anyone help? Thanks!
> The log and my Spark project are attached below for reference.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
