[ https://issues.apache.org/jira/browse/SPARK-21156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16056524#comment-16056524 ]

Monica Raj commented on SPARK-21156:
------------------------------------

This issue has been observed with Spark 1.6.1 and Ranger KMS from Hortonworks, 
and also with Spark 2.1.1 and Key Trustee KMS from Cloudera.

> Spark cannot handle multiple KMS server configuration
> -----------------------------------------------------
>
>                 Key: SPARK-21156
>                 URL: https://issues.apache.org/jira/browse/SPARK-21156
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Shell, Spark Submit, YARN
>    Affects Versions: 1.6.1, 2.1.1
>            Reporter: Monica Raj
>
> The *dfs.encryption.key.provider.uri* configuration parameter in 
> *hdfs-site.xml* can carry one or more key servers in its value. The syntax 
> for a multi-server value is:
> <property>
>   <name>dfs.encryption.key.provider.uri</name>
>   <value>kms://http@<internal host name1>;<internal host name2>;...:9292/kms</value>
> </property>
> as per the documentation: 
> https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.5.3/bk_security/content/ranger_kms_multi_kms.html
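> For illustration, using the two KMS hosts that appear in the stack traces 
> below (and assuming the default 9292 KMS port shown in the syntax above), a 
> concrete value would look like:
> {{kms://http@mas1.multikms.com;mas2.multikms.com:9292/kms}}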
> If multiple KMS servers are configured in this field AND the following 
> Spark configuration values are set:
> *spark.yarn.principal*
> *spark.yarn.keytab*
> then it is not possible to create a SparkContext: the semicolon-separated 
> multi-server value is not parsed, and delegation token setup fails. Below 
> are the stack traces for the same error, seen via the Spark shell and via 
> Zeppelin.
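> The failure mode can be illustrated in isolation. The following is a 
> minimal, hypothetical sketch (plain JDK name resolution, not the actual 
> Hadoop code path) of what happens when the semicolon-joined authority from 
> the kms:// URI is handed to single-host resolution, as 
> SecurityUtil.buildTokenService effectively does:
> {code:java}
> import java.net.InetAddress;
> import java.net.UnknownHostException;
> 
> public class MultiKmsHostRepro {
>     public static void main(String[] args) {
>         // Hypothetical illustration: the kms:// URI authority is the whole
>         // semicolon-separated host list, so code expecting one hostname
>         // receives "mas1.multikms.com;mas2.multikms.com" verbatim.
>         String authority = "mas1.multikms.com;mas2.multikms.com";
>         try {
>             InetAddress.getByName(authority);
>             System.out.println("resolved (unexpected)");
>         } catch (UnknownHostException e) {
>             // Mirrors the "Caused by: java.net.UnknownHostException"
>             // in the traces below
>             System.out.println("UnknownHostException: " + e.getMessage());
>         }
>     }
> }
> {code}
> Resolving either host individually would succeed; it is the joined string 
> that fails, matching the Caused by lines in both traces below.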
> *Error via Spark Shell*
> 17/06/16 22:02:11 ERROR SparkContext: Error initializing SparkContext.
> java.lang.IllegalArgumentException: java.net.UnknownHostException: mas1.multikms.com;mas2.multikms.com
>       at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:374)
>       at org.apache.hadoop.crypto.key.kms.KMSClientProvider.getDelegationTokenService(KMSClientProvider.java:804)
>       at org.apache.hadoop.crypto.key.kms.KMSClientProvider.addDelegationTokens(KMSClientProvider.java:779)
>       at org.apache.hadoop.crypto.key.KeyProviderDelegationTokenExtension.addDelegationTokens(KeyProviderDelegationTokenExtension.java:86)
>       at org.apache.hadoop.hdfs.DistributedFileSystem.addDelegationTokens(DistributedFileSystem.java:2046)
>       at org.apache.spark.deploy.yarn.YarnSparkHadoopUtil$$anonfun$obtainTokensForNamenodes$1.apply(YarnSparkHadoopUtil.scala:131)
>       at org.apache.spark.deploy.yarn.YarnSparkHadoopUtil$$anonfun$obtainTokensForNamenodes$1.apply(YarnSparkHadoopUtil.scala:128)
>       at scala.collection.immutable.Set$Set1.foreach(Set.scala:74)
>       at org.apache.spark.deploy.yarn.YarnSparkHadoopUtil.obtainTokensForNamenodes(YarnSparkHadoopUtil.scala:128)
>       at org.apache.spark.deploy.yarn.Client.getTokenRenewalInterval(Client.scala:593)
>       at org.apache.spark.deploy.yarn.Client.setupLaunchEnv(Client.scala:626)
>       at org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:726)
>       at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:142)
>       at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:57)
>       at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:144)
>       at org.apache.spark.SparkContext.<init>(SparkContext.scala:530)
>       at org.apache.spark.repl.SparkILoop.createSparkContext(SparkILoop.scala:1017)
>       at $line3.$read$$iwC$$iwC.<init>(<console>:15)
>       at $line3.$read$$iwC.<init>(<console>:24)
>       at $line3.$read.<init>(<console>:26)
>       at $line3.$read$.<init>(<console>:30)
>       at $line3.$read$.<clinit>(<console>)
>       at $line3.$eval$.<init>(<console>:7)
>       at $line3.$eval$.<clinit>(<console>)
>       at $line3.$eval.$print(<console>)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:498)
>       at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1065)
>       at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1346)
>       at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:840)
>       at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:871)
>       at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:819)
>       at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:857)
>       at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:902)
>       at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:814)
>       at org.apache.spark.repl.SparkILoopInit$$anonfun$initializeSpark$1.apply(SparkILoopInit.scala:125)
>       at org.apache.spark.repl.SparkILoopInit$$anonfun$initializeSpark$1.apply(SparkILoopInit.scala:124)
>       at org.apache.spark.repl.SparkIMain.beQuietDuring(SparkIMain.scala:324)
>       at org.apache.spark.repl.SparkILoopInit$class.initializeSpark(SparkILoopInit.scala:124)
>       at org.apache.spark.repl.SparkILoop.initializeSpark(SparkILoop.scala:64)
>       at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1$$anonfun$apply$mcZ$sp$5.apply$mcV$sp(SparkILoop.scala:974)
>       at org.apache.spark.repl.SparkILoopInit$class.runThunks(SparkILoopInit.scala:159)
>       at org.apache.spark.repl.SparkILoop.runThunks(SparkILoop.scala:64)
>       at org.apache.spark.repl.SparkILoopInit$class.postInitialization(SparkILoopInit.scala:108)
>       at org.apache.spark.repl.SparkILoop.postInitialization(SparkILoop.scala:64)
>       at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply$mcZ$sp(SparkILoop.scala:991)
>       at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
>       at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
>       at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
>       at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$process(SparkILoop.scala:945)
>       at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1059)
>       at org.apache.spark.repl.Main$.main(Main.scala:31)
>       at org.apache.spark.repl.Main.main(Main.scala)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:498)
>       at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
>       at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
>       at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
>       at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
>       at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> Caused by: java.net.UnknownHostException: mas1.multikms.com;mas2.multikms.com
>       ... 64 more
> *Error via Zeppelin*
> ERROR [2017-06-19 23:24:49,508] ({pool-2-thread-2} Logging.scala[logError]:95) - Error initializing SparkContext.
> java.lang.IllegalArgumentException: java.net.UnknownHostException: mas1.multikms.com;mas2.multikms.com
>       at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:374)
>       at org.apache.hadoop.crypto.key.kms.KMSClientProvider.getDelegationTokenService(KMSClientProvider.java:804)
>       at org.apache.hadoop.crypto.key.kms.KMSClientProvider.addDelegationTokens(KMSClientProvider.java:779)
>       at org.apache.hadoop.crypto.key.KeyProviderDelegationTokenExtension.addDelegationTokens(KeyProviderDelegationTokenExtension.java:86)
>       at org.apache.hadoop.hdfs.DistributedFileSystem.addDelegationTokens(DistributedFileSystem.java:2046)
>       at org.apache.spark.deploy.yarn.YarnSparkHadoopUtil$$anonfun$obtainTokensForNamenodes$1.apply(YarnSparkHadoopUtil.scala:131)
>       at org.apache.spark.deploy.yarn.YarnSparkHadoopUtil$$anonfun$obtainTokensForNamenodes$1.apply(YarnSparkHadoopUtil.scala:128)
>       at scala.collection.immutable.Set$Set1.foreach(Set.scala:74)
>       at org.apache.spark.deploy.yarn.YarnSparkHadoopUtil.obtainTokensForNamenodes(YarnSparkHadoopUtil.scala:128)
>       at org.apache.spark.deploy.yarn.Client.getTokenRenewalInterval(Client.scala:593)
>       at org.apache.spark.deploy.yarn.Client.setupLaunchEnv(Client.scala:626)
>       at org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:726)
>       at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:142)
>       at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:57)
>       at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:144)
>       at org.apache.spark.SparkContext.<init>(SparkContext.scala:530)
>       at org.apache.zeppelin.spark.SparkInterpreter.createSparkContext(SparkInterpreter.java:338)
>       at org.apache.zeppelin.spark.SparkInterpreter.getSparkContext(SparkInterpreter.java:122)
>       at org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:513)
>       at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:69)
>       at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:93)
>       at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:341)
>       at org.apache.zeppelin.scheduler.Job.run(Job.java:176)
>       at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:139)
>       at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>       at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>       at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>       at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>       at java.lang.Thread.run(Thread.java:748)
> Caused by: java.net.UnknownHostException: mas1.multikms.com;mas2.multikms.com
>       ... 31 more



