[ https://issues.apache.org/jira/browse/SPARK-12800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15216243#comment-15216243 ]

Thomas Graves commented on SPARK-12800:
---------------------------------------

You are talking about launching a job using org.apache.spark.deploy.yarn.Client 
directly, correct?  If so, we don't officially support that. I realize it isn't 
currently private and some people are using it, but in 2.0 it will be made 
private.
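
For reference, "launching via org.apache.spark.deploy.yarn.Client directly" means 
roughly the following. This is a minimal sketch against the Spark 1.5-era APIs 
(essentially what Client.main() itself does), not an officially supported entry 
point:

{code}
import org.apache.spark.SparkConf
import org.apache.spark.deploy.yarn.{Client, ClientArguments}

object DirectYarnLaunch {
  def main(argStrings: Array[String]): Unit = {
    // Mark this JVM as YARN mode so SparkHadoopUtil loads its YARN subclass.
    System.setProperty("SPARK_YARN_MODE", "true")
    val sparkConf = new SparkConf()
    val args = new ClientArguments(argStrings, sparkConf)
    new Client(args, sparkConf).run()
  }
}
{code}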



> Subtle bug on Spark Yarn Client under Kerberos Security Mode
> ------------------------------------------------------------
>
>                 Key: SPARK-12800
>                 URL: https://issues.apache.org/jira/browse/SPARK-12800
>             Project: Spark
>          Issue Type: Bug
>          Components: YARN
>    Affects Versions: 1.5.1, 1.5.2
>            Reporter: Chester
>
> Version used: Spark 1.5.1 (1.5.2-SNAPSHOT) 
> Deployment Mode: Yarn-Cluster
> Problem observed: 
>   When running a Spark job directly via the YARN Client (without using 
> spark-submit; I did not verify whether spark-submit has the same issue) with 
> Kerberos security enabled, the first run of the Spark job always fails. It 
> fails because Hadoop considers the job to be in SIMPLE mode rather than 
> Kerberos mode. But if the same job is run again without shutting down the 
> JVM, it passes. If one restarts the JVM, the Spark job fails again. 
> The cause: 
>   Tracking down the source of the issue, I found that the problem seems to 
> lie in Spark's YARN Client.scala. In the Client's prepareLocalResources() 
> method (around line 266 of Client.scala), the following line is called:
> {code}
> YarnSparkHadoopUtil.get.obtainTokensForNamenodes(nns, hadoopConf, credentials)
> {code}
> YarnSparkHadoopUtil.get is in turn initialized via reflection:
> {code}
> object SparkHadoopUtil {
>   // Created once per JVM; the YARN subclass is loaded reflectively.
>   private val hadoop = {
>     val yarnMode = java.lang.Boolean.valueOf(
>       System.getProperty("SPARK_YARN_MODE", System.getenv("SPARK_YARN_MODE")))
>     if (yarnMode) {
>       try {
>         Utils.classForName("org.apache.spark.deploy.yarn.YarnSparkHadoopUtil")
>           .newInstance()
>           .asInstanceOf[SparkHadoopUtil]
>       } catch {
>         case e: Exception => throw new SparkException("Unable to load YARN support", e)
>       }
>     } else {
>       new SparkHadoopUtil
>     }
>   }
>
>   def get: SparkHadoopUtil = {
>     hadoop
>   }
> }
>
> class SparkHadoopUtil extends Logging {
>   // The constructor builds a Configuration from an *empty* SparkConf and
>   // installs it globally on UserGroupInformation.
>   private val sparkConf = new SparkConf()
>   val conf: Configuration = newConfiguration(sparkConf)
>   UserGroupInformation.setConfiguration(conf)
>   // ... rest of class
> }
> {code}
> Here SparkHadoopUtil creates an empty SparkConf, builds a Hadoop 
> Configuration from it, and installs that on UserGroupInformation:
> {code}
> UserGroupInformation.setConfiguration(conf)
> {code}
>   Since UserGroupInformation's authentication method is static (process-wide), 
> the call above wipes out the security settings: 
> UserGroupInformation.isSecurityEnabled() changes from true to false, and the 
> subsequent calls fail. 
>   Because SparkHadoopUtil.hadoop is a static, immutable value, it is not 
> created again on the next run, so UserGroupInformation.setConfiguration(conf) 
> is not called again and the subsequent Spark jobs work. 
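> The global flip can be reproduced in isolation. A minimal sketch (hypothetical 
> demo code, not from Spark), assuming core-site.xml on the classpath does not 
> itself set hadoop.security.authentication to kerberos:
> {code}
> import org.apache.hadoop.conf.Configuration
> import org.apache.hadoop.security.UserGroupInformation
>
> object UgiFlipDemo {
>   def main(args: Array[String]): Unit = {
>     val secured = new Configuration()
>     secured.set("hadoop.security.authentication", "kerberos")
>     UserGroupInformation.setConfiguration(secured)
>     println(UserGroupInformation.isSecurityEnabled())  // true
>
>     // What SparkHadoopUtil's constructor effectively does: install a
>     // Configuration built without the Kerberos setting.
>     UserGroupInformation.setConfiguration(new Configuration())
>     println(UserGroupInformation.isSecurityEnabled())  // false: back to SIMPLE
>   }
> }
> {code}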
> The work around: 
> {code}
> // First initialize SparkHadoopUtil, which creates the static instance
> // and sets UserGroupInformation to an empty Hadoop Configuration.
> // We then need to reset the UserGroupInformation configuration afterwards.
> val util = SparkHadoopUtil.get
> UserGroupInformation.setConfiguration(hadoopConf)
> {code}
> Then call client.run(). 
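> Putting it together, the full launch sequence with the workaround looks 
> roughly like this (a sketch against the Spark 1.5 APIs; argStrings and 
> hadoopConf are the caller's own, with hadoopConf carrying the Kerberos 
> settings):
> {code}
> import org.apache.hadoop.conf.Configuration
> import org.apache.hadoop.security.UserGroupInformation
> import org.apache.spark.SparkConf
> import org.apache.spark.deploy.SparkHadoopUtil
> import org.apache.spark.deploy.yarn.{Client, ClientArguments}
>
> object SecuredLaunch {
>   def runSecured(argStrings: Array[String], hadoopConf: Configuration): Unit = {
>     System.setProperty("SPARK_YARN_MODE", "true")
>     val sparkConf = new SparkConf()
>     // Touching SparkHadoopUtil.get triggers the static init described above...
>     val util = SparkHadoopUtil.get
>     // ...so re-install the Kerberos-enabled Configuration before anything else.
>     UserGroupInformation.setConfiguration(hadoopConf)
>     val client = new Client(
>       new ClientArguments(argStrings, sparkConf), hadoopConf, sparkConf)
>     client.run()
>   }
> }
> {code}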


