Re: Kerberos and YARN - functions in spark-shell and spark submit local but not cluster mode

2016-12-08 Thread Marcelo Vanzin
You could have posted just the error, which is at the end of my response.

Why are you trying to use WebHDFS? I'm not really sure how
authentication works with that. But generally applications use HDFS
(which uses a different URI scheme), and Spark should work fine with
that.
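A minimal sketch of what this suggests: read through the hdfs:// scheme, for which Spark fetches delegation tokens automatically at submit time. The namenode host, port, and path below are placeholders, not values from this thread.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object HdfsReadSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("hdfs-read"))
    // webhdfs:// goes through the HTTP gateway, whose SPNEGO/Kerberos setup is
    // separate; hdfs:// uses the native RPC client, which Spark's YARN support
    // obtains delegation tokens for when the application is submitted.
    val rdd = sc.textFile("hdfs://namenode.example.com:8020/user/me/input")
    println(rdd.count())
  }
}
```

This requires a running Kerberized cluster, so it is a sketch rather than something runnable in isolation.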


Error:
Authentication required
org.apache.hadoop.security.AccessControlException: Authentication required
at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:457)
at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$200(WebHdfsFileSystem.java:113)
at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:738)
at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:582)
at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:612)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:608)
at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getDelegationToken(WebHdfsFileSystem.java:1507)
at org.apache.hadoop.fs.FileSystem.collectDelegationTokens(FileSystem.java:545)
at org.apache.hadoop.fs.FileSystem.addDelegationTokens(FileSystem.java:523)
at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:140)
at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:100)
at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodes(TokenCache.java:80)
at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:206)
at org.apache.hadoop.mapred.SequenceFileInputFormat.listStatus(SequenceFileInputFormat.java:45)
at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:315)
at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:199)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:242)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:240)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:240)


On Thu, Dec 8, 2016 at 12:29 PM, Gerard Casey  wrote:
> Sure - I wanted to check with admin before sharing. I’ve attached it now, 
> does this help?
>
> Many thanks again,
>
> G
>
>
>
>> On 8 Dec 2016, at 20:18, Marcelo Vanzin  wrote:
>>
>> Then you probably have a configuration error somewhere. Since you
>> haven't actually posted the error you're seeing, it's kinda hard to
>> help any further.
>>
>> On Thu, Dec 8, 2016 at 11:17 AM, Gerard Casey  
>> wrote:
>>> Right. I’m confident that is setup correctly.
>>>
>>> I can run the SparkPi test script. The main difference between it and my 
>>> application is that it doesn’t access HDFS.
>>>
 On 8 Dec 2016, at 18:43, Marcelo Vanzin  wrote:

 On Wed, Dec 7, 2016 at 11:54 PM, Gerard Casey  
 wrote:
> To be specific, where exactly should spark.authenticate be set to true?

 spark.authenticate has nothing to do with kerberos. It's for
 authentication between different Spark processes belonging to the same
 app.

 --
 Marcelo

 -
 To unsubscribe e-mail: user-unsubscr...@spark.apache.org

>>>
>>
>>
>>
>> --
>> Marcelo
>>
>> -
>> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>>
>
>



-- 
Marcelo

-
To unsubscribe e-mail: user-unsubscr...@spark.apache.org



Re: Kerberos and YARN - functions in spark-shell and spark submit local but not cluster mode

2016-12-08 Thread Gerard Casey
Sure - I wanted to check with admin before sharing. I’ve attached it now, does 
this help?

Many thanks again,

G

Container: container_e34_1479877553404_0174_01_03 on hdp-node12.xcat.cluster_45454_1481228528201

LogType:directory.info
Log Upload Time:Thu Dec 08 20:22:08 + 2016
LogLength:5138
Log Contents:
ls -l:
total 28
lrwxrwxrwx 1 my_user_name hadoop   70 Dec  8 20:21 __app__.jar -> /hadoop/yarn/local/usercache/my_user_name/filecache/26/graphx_sp_2.10-1.0.jar
lrwxrwxrwx 1 my_user_name hadoop   63 Dec  8 20:21 __spark__.jar -> /hadoop_1/hadoop/yarn/local/filecache/11/spark-hdp-assembly.jar
lrwxrwxrwx 1 my_user_name hadoop   94 Dec  8 20:21 __spark_conf__ -> /hadoop_1/hadoop/yarn/local/usercache/my_user_name/filecache/25/__spark_conf__2528926660896665250.zip
-rw------- 1 my_user_name hadoop  340 Dec  8 20:21 container_tokens
-rwx------ 1 my_user_name hadoop 6195 Dec  8 20:21 launch_container.sh
drwxr-s--- 2 my_user_name hadoop 4096 Dec  8 20:21 tmp
find -L . -maxdepth 5 -ls:
2752703       4 drwxr-s---   3 my_user_name hadoop            4096 Dec  8 20:21 .
106430595 184304 -r-xr-xr-x   1 yarn         hadoop       188727178 Dec  5 18:22 ./__spark__.jar
106431527     4 drwx------   2 my_user_name my_user_name      4096 Dec  8 20:21 ./__spark_conf__
106431559     4 -r-x------   1 my_user_name my_user_name       951 Dec  8 20:21 ./__spark_conf__/mapred-env.cmd
106431558     4 -r-x------   1 my_user_name my_user_name      1000 Dec  8 20:21 ./__spark_conf__/ssl-server.xml
106431528     8 -r-x------   1 my_user_name my_user_name      5410 Dec  8 20:21 ./__spark_conf__/hadoop-env.sh
106431553     4 -r-x------   1 my_user_name my_user_name      2316 Dec  8 20:21 ./__spark_conf__/ssl-client.xml.example
106431532     4 -r-x------   1 my_user_name my_user_name      3979 Dec  8 20:21 ./__spark_conf__/hadoop-env.cmd
106431546    12 -r-x------   1 my_user_name my_user_name      8217 Dec  8 20:21 ./__spark_conf__/hdfs-site.xml
106431545     8 -r-x------   1 my_user_name my_user_name      5637 Dec  8 20:21 ./__spark_conf__/yarn-env.sh
106431552     4 -r-x------   1 my_user_name my_user_name      1602 Dec  8 20:21 ./__spark_conf__/health_check
106431537     4 -r-x------   1 my_user_name my_user_name      1631 Dec  8 20:21 ./__spark_conf__/kms-log4j.properties
106431563     8 -r-x------   1 my_user_name my_user_name      5511 Dec  8 20:21 ./__spark_conf__/kms-site.xml
106431530     8 -r-x------   1 my_user_name my_user_name      7353 Dec  8 20:21 ./__spark_conf__/mapred-site.xml
106431548     4 -r-x------   1 my_user_name my_user_name      1072 Dec  8 20:21 ./__spark_conf__/container-executor.cfg
106431536     0 -r-x------   1 my_user_name my_user_name         0 Dec  8 20:21 ./__spark_conf__/yarn.exclude
106431562     8 -r-x------   1 my_user_name my_user_name      4113 Dec  8 20:21 ./__spark_conf__/mapred-queues.xml.template
106431538     4 -r-x------   1 my_user_name my_user_name      2250 Dec  8 20:21 ./__spark_conf__/yarn-env.cmd
106431547     4 -r-x------   1 my_user_name my_user_name      1020 Dec  8 20:21 ./__spark_conf__/commons-logging.properties
106431543     4 -r-x------   1 my_user_name my_user_name       758 Dec  8 20:21 ./__spark_conf__/mapred-site.xml.template
106431554     4 -r-x------   1 my_user_name my_user_name      1527 Dec  8 20:21 ./__spark_conf__/kms-env.sh
106431556     4 -r-x------   1 my_user_name my_user_name       760 Dec  8 20:21 ./__spark_conf__/slaves
106431561     4 -r-x------   1 my_user_name my_user_name       945 Dec  8 20:21 ./__spark_conf__/taskcontroller.cfg
106431542     4 -r-x------   1 my_user_name my_user_name      2358 Dec  8 20:21 ./__spark_conf__/topology_script.py
106431539     4 -r-x------   1 my_user_name my_user_name       884 Dec  8 20:21 ./__spark_conf__/ssl-client.xml
106431531     4 -r-x------   1 my_user_name my_user_name      2207 Dec  8 20:21 ./__spark_conf__/hadoop-metrics2.properties
106431564     4 -r-x------   1 my_user_name my_user_name       506 Dec  8 20:21 ./__spark_conf__/__spark_conf__.properties
106431550     8 -r-x------   1 my_user_name my_user_name      4221 Dec  8 20:21 ./__spark_conf__/task-log4j.properties
106431551     4 -r-x------   1 my_user_name my_user_name       856 Dec  8 20:21 ./__spark_conf__/mapred-env.sh
106431529    12 -r-x------   1 my_user_name my_user_name      9313 Dec  8 20:21 ./__spark_conf__/log4j.properties
106431541     4 -r-x------   1 my_user_name my_user_name      3518 Dec  8 20:21 ./__spark_conf__/kms-acls.xml
106431534     8 -r-x------   1 my_user_name my_user_name      7634 Dec  8 20:21 ./__spark_conf__/core-site.xml
106431557     4 -r-x------   1 my_user_name my_user_name      2081 Dec  8 20:21 ./__spark_conf__/topology_mappings.data
106431549     4 -r-x------   1

Re: Kerberos and YARN - functions in spark-shell and spark submit local but not cluster mode

2016-12-08 Thread Marcelo Vanzin
Then you probably have a configuration error somewhere. Since you
haven't actually posted the error you're seeing, it's kinda hard to
help any further.

On Thu, Dec 8, 2016 at 11:17 AM, Gerard Casey  wrote:
> Right. I’m confident that is setup correctly.
>
> I can run the SparkPi test script. The main difference between it and my 
> application is that it doesn’t access HDFS.
>
>> On 8 Dec 2016, at 18:43, Marcelo Vanzin  wrote:
>>
>> On Wed, Dec 7, 2016 at 11:54 PM, Gerard Casey  
>> wrote:
>>> To be specific, where exactly should spark.authenticate be set to true?
>>
>> spark.authenticate has nothing to do with kerberos. It's for
>> authentication between different Spark processes belonging to the same
>> app.
>>
>> --
>> Marcelo
>>
>> -
>> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>>
>



-- 
Marcelo

-
To unsubscribe e-mail: user-unsubscr...@spark.apache.org



Re: Kerberos and YARN - functions in spark-shell and spark submit local but not cluster mode

2016-12-08 Thread Gerard Casey
Right. I’m confident that is setup correctly.

I can run the SparkPi test script. The main difference between it and my 
application is that it doesn’t access HDFS. 

> On 8 Dec 2016, at 18:43, Marcelo Vanzin  wrote:
> 
> On Wed, Dec 7, 2016 at 11:54 PM, Gerard Casey  
> wrote:
>> To be specific, where exactly should spark.authenticate be set to true?
> 
> spark.authenticate has nothing to do with kerberos. It's for
> authentication between different Spark processes belonging to the same
> app.
> 
> -- 
> Marcelo
> 
> -
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
> 


-
To unsubscribe e-mail: user-unsubscr...@spark.apache.org



Re: Kerberos and YARN - functions in spark-shell and spark submit local but not cluster mode

2016-12-08 Thread Marcelo Vanzin
On Wed, Dec 7, 2016 at 11:54 PM, Gerard Casey  wrote:
> To be specific, where exactly should spark.authenticate be set to true?

spark.authenticate has nothing to do with kerberos. It's for
authentication between different Spark processes belonging to the same
app.

-- 
Marcelo

-
To unsubscribe e-mail: user-unsubscr...@spark.apache.org



Re: Kerberos and YARN - functions in spark-shell and spark submit local but not cluster mode

2016-12-07 Thread Gerard Casey
Thanks Marcin,

That seems to be the case. It explains why there is no documentation on this 
part too!

To be specific, where exactly should spark.authenticate be set to true?

Many thanks,

Gerry

> On 8 Dec 2016, at 08:46, Marcin Pastecki  wrote:
> 
> My understanding is that the token generation is handled by Spark itself as 
> long as you were authenticated in Kerberos when submitting the job and 
> spark.authenticate is set to true.
> 
> --keytab and --principal options should be used for "long" running job, when 
> you may need to do ticket renewal. Spark will handle it then. I may be wrong 
> though.
> 
> I guess it gets even more complicated if you need to access other secured 
> service from Spark like hbase or Phoenix, but i guess this is for another 
> discussion.
> 
> Regards,
> Marcin
> 
> 
> On Thu, Dec 8, 2016, 08:40 Gerard Casey wrote:
> I just read an interesting comment on cloudera:
> 
> What does it mean by “when the job is submitted,and you have a kinit, you 
> will have TOKEN to access HDFS, you would need to pass that on, or the 
> KERBEROS ticket” ?
> 
> Reference and full quote:
> 
> In a cluster which is kerberised there is no SIMPLE authentication. Make sure 
> that you have run kinit before you run the application.
> Second thing to check: In your application you need to do the right thing and 
> either pass on the TOKEN or a KERBEROS ticket.
> When the job is submitted, and you have done a kinit, you will have TOKEN to 
> access HDFS you would need to pass that on, or the KERBEROS ticket.
> You will need to handle this in your code. I can not see exactly what you are 
> doing at that point in the startup of your code but any HDFS access will 
> require a TOKEN or KERBEROS ticket.
>  
> Cheers,
> Wilfred
> 
>> On 8 Dec 2016, at 08:35, Gerard Casey wrote:
>> 
>> Thanks Marcelo.
>> 
>> I’ve completely removed it. Ok - even if I read/write from HDFS?
>> 
>> Trying the SparkPi example now
>> 
>> G
>> 
>>> On 7 Dec 2016, at 22:10, Marcelo Vanzin wrote:
>>> 
>>> Have you removed all the code dealing with Kerberos that you posted?
>>> You should not be setting those principal / keytab configs.
>>> 
>>> Literally all you have to do is login with kinit then run spark-submit.
>>> 
>>> Try with the SparkPi example for instance, instead of your own code.
>>> If that doesn't work, you have a configuration issue somewhere.
>>> 
>>> On Wed, Dec 7, 2016 at 1:09 PM, Gerard Casey wrote:
 Thanks.
 
 I’ve checked the TGT, principal and key tab. Where to next?!
 
> On 7 Dec 2016, at 22:03, Marcelo Vanzin wrote:
> 
> On Wed, Dec 7, 2016 at 12:15 PM, Gerard Casey wrote:
>> Can anyone point me to a tutorial or a run through of how to use Spark 
>> with
>> Kerberos? This is proving to be quite confusing. Most search results on 
>> the
>> topic point to what needs inputted at the point of `sparks submit` and 
>> not
>> the changes needed in the actual src/main/.scala file
> 
> You don't need to write any special code to run Spark with Kerberos.
> Just write your application normally, and make sure you're logged in
> to the KDC (i.e. "klist" shows a valid TGT) before running your app.
> 
> 
> --
> Marcelo
 
>>> 
>>> 
>>> 
>>> -- 
>>> Marcelo
>> 
> 



Re: Kerberos and YARN - functions in spark-shell and spark submit local but not cluster mode

2016-12-07 Thread Marcin Pastecki
My understanding is that the token generation is handled by Spark itself as
long as you were authenticated in Kerberos when submitting the job and
spark.authenticate is set to true.

--keytab and --principal options should be used for "long" running job,
when you may need to do ticket renewal. Spark will handle it then. I may be
wrong though.
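As a sketch of the point above, a long-running cluster-mode submission where Spark handles ticket renewal from the keytab. The principal, realm, keytab path, and jar name are placeholders, not values from this thread, and this only works against a Kerberized YARN cluster.

```shell
# Log in once; then --principal/--keytab let Spark re-login for renewal.
kinit -kt /path/to/me.keytab me@EXAMPLE.COM

spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --principal me@EXAMPLE.COM \
  --keytab /path/to/me.keytab \
  --class graphx_sp \
  graphx_sp_2.10-1.0.jar
```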

I guess it gets even more complicated if you need to access other secured
service from Spark like hbase or Phoenix, but i guess this is for another
discussion.

Regards,
Marcin

On Thu, Dec 8, 2016, 08:40 Gerard Casey  wrote:

> I just read an interesting comment on cloudera:
>
> What does it mean by “when the job is submitted,and you have a kinit, you
> will have TOKEN to access HDFS, you would need to pass that on, or the
> KERBEROS ticket” ?
>
> Reference and full quote:
>
> In a cluster which is kerberised there is no SIMPLE authentication. Make
> sure that you have run kinit before you run the application.
> Second thing to check: In your application you need to do the right thing
> and either pass on the TOKEN or a KERBEROS ticket.
> When the job is submitted, and you have done a kinit, you will have TOKEN
> to access HDFS you would need to pass that on, or the KERBEROS ticket.
> You will need to handle this in your code. I can not see exactly what you
> are doing at that point in the startup of your code but any HDFS access
> will require a TOKEN or KERBEROS ticket.
>
>
> Cheers,
> Wilfred
>
> On 8 Dec 2016, at 08:35, Gerard Casey  wrote:
>
> Thanks Marcelo.
>
> I’ve completely removed it. Ok - even if I read/write from HDFS?
>
> Trying the SparkPi example now
>
> G
>
> On 7 Dec 2016, at 22:10, Marcelo Vanzin  wrote:
>
> Have you removed all the code dealing with Kerberos that you posted?
> You should not be setting those principal / keytab configs.
>
> Literally all you have to do is login with kinit then run spark-submit.
>
> Try with the SparkPi example for instance, instead of your own code.
> If that doesn't work, you have a configuration issue somewhere.
>
> On Wed, Dec 7, 2016 at 1:09 PM, Gerard Casey 
> wrote:
>
> Thanks.
>
> I’ve checked the TGT, principal and key tab. Where to next?!
>
> On 7 Dec 2016, at 22:03, Marcelo Vanzin  wrote:
>
> On Wed, Dec 7, 2016 at 12:15 PM, Gerard Casey 
> wrote:
>
> Can anyone point me to a tutorial or a run through of how to use Spark with
> Kerberos? This is proving to be quite confusing. Most search results on the
> topic point to what needs inputted at the point of `sparks submit` and not
> the changes needed in the actual src/main/.scala file
>
>
> You don't need to write any special code to run Spark with Kerberos.
> Just write your application normally, and make sure you're logged in
> to the KDC (i.e. "klist" shows a valid TGT) before running your app.
>
>
> --
> Marcelo
>
>
>
>
>
> --
> Marcelo
>
>
>
>


Re: Kerberos and YARN - functions in spark-shell and spark submit local but not cluster mode

2016-12-07 Thread Gerard Casey
I just read an interesting comment on cloudera:

What does it mean by “when the job is submitted,and you have a kinit, you will 
have TOKEN to access HDFS, you would need to pass that on, or the KERBEROS 
ticket” ?

Reference and full quote:

In a cluster which is kerberised there is no SIMPLE authentication. Make sure 
that you have run kinit before you run the application.
Second thing to check: In your application you need to do the right thing and 
either pass on the TOKEN or a KERBEROS ticket.
When the job is submitted, and you have done a kinit, you will have TOKEN to 
access HDFS you would need to pass that on, or the KERBEROS ticket.
You will need to handle this in your code. I can not see exactly what you are 
doing at that point in the startup of your code but any HDFS access will 
require a TOKEN or KERBEROS ticket.
 
Cheers,
Wilfred
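For illustration, a sketch of the "pass on the TOKEN" step the quote describes: after a Kerberos login, explicitly collecting HDFS delegation tokens and attaching them to the current user. The choice of "yarn" as the renewer principal is an assumption, not something stated in the thread.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.FileSystem
import org.apache.hadoop.security.{Credentials, UserGroupInformation}

object TokenSketch {
  def main(args: Array[String]): Unit = {
    // Requires a prior kinit (or keytab login) so the namenode RPC succeeds.
    val fs = FileSystem.get(new Configuration())
    val creds = new Credentials()
    // Ask the filesystem for delegation tokens; "yarn" as renewer is an assumption.
    fs.addDelegationTokens("yarn", creds)
    // Attach the tokens to the current UGI so code running without a Kerberos
    // TGT (e.g. executors) can still talk to HDFS.
    UserGroupInformation.getCurrentUser.addCredentials(creds)
  }
}
```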

> On 8 Dec 2016, at 08:35, Gerard Casey  wrote:
> 
> Thanks Marcelo.
> 
> I’ve completely removed it. Ok - even if I read/write from HDFS?
> 
> Trying the SparkPi example now
> 
> G
> 
>> On 7 Dec 2016, at 22:10, Marcelo Vanzin wrote:
>> Have you removed all the code dealing with Kerberos that you posted?
>> You should not be setting those principal / keytab configs.
>> 
>> Literally all you have to do is login with kinit then run spark-submit.
>> 
>> Try with the SparkPi example for instance, instead of your own code.
>> If that doesn't work, you have a configuration issue somewhere.
>> 
>> On Wed, Dec 7, 2016 at 1:09 PM, Gerard Casey wrote:
>>> Thanks.
>>> 
>>> I’ve checked the TGT, principal and key tab. Where to next?!
>>> 
 On 7 Dec 2016, at 22:03, Marcelo Vanzin wrote:
 
 On Wed, Dec 7, 2016 at 12:15 PM, Gerard Casey wrote:
> Can anyone point me to a tutorial or a run through of how to use Spark 
> with
> Kerberos? This is proving to be quite confusing. Most search results on 
> the
> topic point to what needs inputted at the point of `sparks submit` and not
> the changes needed in the actual src/main/.scala file
 
 You don't need to write any special code to run Spark with Kerberos.
 Just write your application normally, and make sure you're logged in
 to the KDC (i.e. "klist" shows a valid TGT) before running your app.
 
 
 --
 Marcelo
>>> 
>> 
>> 
>> 
>> -- 
>> Marcelo
> 



Re: Kerberos and YARN - functions in spark-shell and spark submit local but not cluster mode

2016-12-07 Thread Gerard Casey
Thanks Marcelo.

I’ve completely removed it. Ok - even if I read/write from HDFS?

Trying the SparkPi example now

G

> On 7 Dec 2016, at 22:10, Marcelo Vanzin  wrote:
> 
> Have you removed all the code dealing with Kerberos that you posted?
> You should not be setting those principal / keytab configs.
> 
> Literally all you have to do is login with kinit then run spark-submit.
> 
> Try with the SparkPi example for instance, instead of your own code.
> If that doesn't work, you have a configuration issue somewhere.
> 
> On Wed, Dec 7, 2016 at 1:09 PM, Gerard Casey wrote:
>> Thanks.
>> 
>> I’ve checked the TGT, principal and key tab. Where to next?!
>> 
>>> On 7 Dec 2016, at 22:03, Marcelo Vanzin  wrote:
>>> 
>>> On Wed, Dec 7, 2016 at 12:15 PM, Gerard Casey  
>>> wrote:
 Can anyone point me to a tutorial or a run through of how to use Spark with
 Kerberos? This is proving to be quite confusing. Most search results on the
 topic point to what needs inputted at the point of `sparks submit` and not
 the changes needed in the actual src/main/.scala file
>>> 
>>> You don't need to write any special code to run Spark with Kerberos.
>>> Just write your application normally, and make sure you're logged in
>>> to the KDC (i.e. "klist" shows a valid TGT) before running your app.
>>> 
>>> 
>>> --
>>> Marcelo
>> 
> 
> 
> 
> -- 
> Marcelo



Re: Kerberos and YARN - functions in spark-shell and spark submit local but not cluster mode

2016-12-07 Thread Marcelo Vanzin
Have you removed all the code dealing with Kerberos that you posted?
You should not be setting those principal / keytab configs.

Literally all you have to do is login with kinit then run spark-submit.

Try with the SparkPi example for instance, instead of your own code.
If that doesn't work, you have a configuration issue somewhere.
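The suggestion above as commands. The example-jar path is a typical HDP location and is an assumption; adjust it to wherever the Spark examples jar lives on your cluster.

```shell
kinit me@EXAMPLE.COM   # log in to the KDC (principal is a placeholder)
klist                  # confirm a valid TGT is present

spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --class org.apache.spark.examples.SparkPi \
  /usr/hdp/current/spark-client/lib/spark-examples-*.jar 10
```

If SparkPi succeeds under cluster mode but your own job does not, the difference is in the job itself (e.g. HDFS access), not the Kerberos setup.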

On Wed, Dec 7, 2016 at 1:09 PM, Gerard Casey  wrote:
> Thanks.
>
> I’ve checked the TGT, principal and key tab. Where to next?!
>
>> On 7 Dec 2016, at 22:03, Marcelo Vanzin  wrote:
>>
>> On Wed, Dec 7, 2016 at 12:15 PM, Gerard Casey  
>> wrote:
>>> Can anyone point me to a tutorial or a run through of how to use Spark with
>>> Kerberos? This is proving to be quite confusing. Most search results on the
>>> topic point to what needs inputted at the point of `sparks submit` and not
>>> the changes needed in the actual src/main/.scala file
>>
>> You don't need to write any special code to run Spark with Kerberos.
>> Just write your application normally, and make sure you're logged in
>> to the KDC (i.e. "klist" shows a valid TGT) before running your app.
>>
>>
>> --
>> Marcelo
>



-- 
Marcelo

-
To unsubscribe e-mail: user-unsubscr...@spark.apache.org



Re: Kerberos and YARN - functions in spark-shell and spark submit local but not cluster mode

2016-12-07 Thread Gerard Casey
Thanks.

I’ve checked the TGT, principal and key tab. Where to next?! 

> On 7 Dec 2016, at 22:03, Marcelo Vanzin  wrote:
> 
> On Wed, Dec 7, 2016 at 12:15 PM, Gerard Casey  
> wrote:
>> Can anyone point me to a tutorial or a run through of how to use Spark with
>> Kerberos? This is proving to be quite confusing. Most search results on the
>> topic point to what needs inputted at the point of `sparks submit` and not
>> the changes needed in the actual src/main/.scala file
> 
> You don't need to write any special code to run Spark with Kerberos.
> Just write your application normally, and make sure you're logged in
> to the KDC (i.e. "klist" shows a valid TGT) before running your app.
> 
> 
> -- 
> Marcelo


-
To unsubscribe e-mail: user-unsubscr...@spark.apache.org



Re: Kerberos and YARN - functions in spark-shell and spark submit local but not cluster mode

2016-12-07 Thread Marcelo Vanzin
On Wed, Dec 7, 2016 at 12:15 PM, Gerard Casey  wrote:
> Can anyone point me to a tutorial or a run through of how to use Spark with
> Kerberos? This is proving to be quite confusing. Most search results on the
> topic point to what needs inputted at the point of `sparks submit` and not
> the changes needed in the actual src/main/.scala file

You don't need to write any special code to run Spark with Kerberos.
Just write your application normally, and make sure you're logged in
to the KDC (i.e. "klist" shows a valid TGT) before running your app.


-- 
Marcelo

-
To unsubscribe e-mail: user-unsubscr...@spark.apache.org



Re: Kerberos and YARN - functions in spark-shell and spark submit local but not cluster mode

2016-12-07 Thread Gerard Casey
Thanks Marcelo,

Turns out I had missed setup steps in the actual file itself. Thanks to Richard 
for the help here. He pointed me to some java implementations.

I’m using the import org.apache.hadoop.security API.

I now have:

/* graphx_sp.scala */
import scala.util.Try
import scala.io.Source
import scala.util.parsing.json._
import org.apache.spark.SparkConf
import org.apache.spark.SparkContext
import org.apache.spark._
import org.apache.spark.sql.functions._
import org.apache.spark.sql.Row
import org.apache.spark.graphx._
import org.apache.spark.rdd.RDD
import org.apache.hadoop.security.UserGroupInformation

object graphx_sp {
  def main(args: Array[String]) {
    // Settings
    val conf = new SparkConf().setAppName("graphx_sp")
    val sc = new SparkContext(conf)
    val sqlContext = new org.apache.spark.sql.SQLContext(sc)
    sc.setLogLevel("WARN")

    val principal = conf.get("spark.yarn.principal")
    val keytab = conf.get("spark.yarn.keytab")
    val loginUser = UserGroupInformation.loginUserFromKeytab(principal, keytab)

    UserGroupInformation.getLoginUser(loginUser)
    // Actual code….

Running sbt returns:

src/main/scala/graphx_sp.scala:35: too many arguments for method getLoginUser: ()org.apache.hadoop.security.UserGroupInformation
[error] UserGroupInformation.getLoginUser(loginUser)
[error]                                   ^
[error] one error found
[error] (compile:compileIncremental) Compilation failed

The docs show that there should be two inputs, the principal and keytab. See here.

Can anyone point me to a tutorial or a run through of how to use Spark with 
Kerberos? This is proving to be quite confusing. Most search results on the 
topic point to what needs inputted at the point of `sparks submit` and not the 
changes needed in the actual src/main/.scala file

Gerry

> On 5 Dec 2016, at 19:45, Marcelo Vanzin  wrote:
> 
> That's not the error, that's just telling you the application failed.
> You have to look at the YARN logs for application_1479877553404_0041
> to see why it failed.
> 
> On Mon, Dec 5, 2016 at 10:44 AM, Gerard Casey  
> wrote:
>> Thanks Marcelo,
>> 
>> My understanding from a few pointers is that this may be due to insufficient 
>> read permissions to the key tab or a corrupt key tab. I have checked the 
>> read permissions and they are ok. I can see that it is initially configuring 
>> correctly:
>> 
>>   INFO security.UserGroupInformation: Login successful for user 
>> user@login_node using keytab file /path/to/keytab
>> 
>> I’ve added the full trace below.
>> 
>> Gerry
>> 
>> Full trace:
>> 
>> Multiple versions of Spark are installed but SPARK_MAJOR_VERSION is not set
>> Spark1 will be picked by default
>> 16/12/05 18:23:27 WARN util.NativeCodeLoader: Unable to load native-hadoop 
>> library for your platform... using builtin-java classes where applicable
>> 16/12/05 18:23:27 INFO security.UserGroupInformation: Login successful for 
>> user me@login_node using keytab file /path/to/keytab
>> 16/12/05 18:23:27 INFO yarn.Client: Attempting to login to the Kerberos 
>> using principal: me@login_node and keytab: /path/to/keytab
>> 16/12/05 18:23:28 INFO impl.TimelineClientImpl: Timeline service address: 
>> http://login_node1.xcat.cluster:8188/ws/v1/timeline/
>> 16/12/05 18:23:28 INFO client.RMProxy: Connecting to ResourceManager at 
>> login_node1.xcat.cluster/
>> 16/12/05 18:23:28 INFO client.AHSProxy: Connecting to Application History 
>> server at login_node1.xcat.cluster/
>> 16/12/05 18:23:28 WARN shortcircuit.DomainSocketFactory: The short-circuit 
>> local reads feature cannot be used because libhadoop cannot be loaded.
>> 16/12/05 18:23:28 INFO yarn.Client: Requesting a new application from 
>> cluster with 32 NodeManagers
>> 16/12/05 18:23:28 INFO yarn.Client: Verifying our application has not 
>> requested more than the maximum memory capability of the cluster (15360 MB 
>> per container)
>> 16/12/05 18:23:28 INFO yarn.Client: Will allocate AM container, with 1408 MB 
>> memory including 384 MB overhead
>> 16/12/05 18:23:28 INFO yarn.Client: Setting up container launch context for 
>> our AM
>> 16/12/05 18:23:28 INFO yarn.Client: Setting up the launch environment for 
>> our AM container
>> 16/12/05 18:23:28 INFO yarn.Client: Using the spark assembly jar on HDFS 
>> because you are using HDP, 
>> defaultSparkAssembly:hdfs://login_node1.xcat.cluster:8020/hdp/apps/2.5.0.0-1245/spark/spark-hdp-assembly.jar
>> 16/12/05 18:23:28 INFO yarn.Client: Credentials file set to:
>> 16/12/05 18:23:28 INFO yarn.YarnSparkHadoopUtil: getting token for namenode: 
>> hdfs://login_node1.xcat.cluster:8020/user/me/.sparkStaging/application_
>> 16/12/05 18:23:28 INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token 
>> 1856 for me 

Re: Kerberos and YARN - functions in spark-shell and spark submit local but not cluster mode

2016-12-05 Thread Marcelo Vanzin
That's not the error, that's just telling you the application failed.
You have to look at the YARN logs for application_1479877553404_0041
to see why it failed.

On Mon, Dec 5, 2016 at 10:44 AM, Gerard Casey  wrote:
> Thanks Marcelo,
>
> My understanding from a few pointers is that this may be due to insufficient 
> read permissions to the key tab or a corrupt key tab. I have checked the read 
> permissions and they are ok. I can see that it is initially configuring 
> correctly:
>
>INFO security.UserGroupInformation: Login successful for user 
> user@login_node using keytab file /path/to/keytab
>
> I’ve added the full trace below.
>
> Gerry
>
> Full trace:
>
> Multiple versions of Spark are installed but SPARK_MAJOR_VERSION is not set
> Spark1 will be picked by default
> 16/12/05 18:23:27 WARN util.NativeCodeLoader: Unable to load native-hadoop 
> library for your platform... using builtin-java classes where applicable
> 16/12/05 18:23:27 INFO security.UserGroupInformation: Login successful for 
> user me@login_node using keytab file /path/to/keytab
> 16/12/05 18:23:27 INFO yarn.Client: Attempting to login to the Kerberos using 
> principal: me@login_node and keytab: /path/to/keytab
> 16/12/05 18:23:28 INFO impl.TimelineClientImpl: Timeline service address: 
> http://login_node1.xcat.cluster:8188/ws/v1/timeline/
> 16/12/05 18:23:28 INFO client.RMProxy: Connecting to ResourceManager at 
> login_node1.xcat.cluster/
> 16/12/05 18:23:28 INFO client.AHSProxy: Connecting to Application History 
> server at login_node1.xcat.cluster/
Re: Kerberos and YARN - functions in spark-shell and spark submit local but not cluster mode

2016-12-05 Thread Gerard Casey
Thanks Marcelo,

My understanding from a few pointers is that this may be due to insufficient 
read permissions on the keytab, or a corrupt keytab. I have checked the read 
permissions and they are OK. I can see that it is initially configuring 
correctly:

   INFO security.UserGroupInformation: Login successful for user 
user@login_node using keytab file /path/to/keytab
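A quick way to sanity-check the keytab itself, independent of Spark — the path
and principal below are placeholders, not the real ones from this cluster:

```shell
# Sketch only: keytab path and principal are placeholders, adjust to yours.
KEYTAB=/path/to/keytab
PRINCIPAL=me@EXAMPLE.REALM

# Checks to run on the submitting host (printed here rather than executed):
cat <<EOF
ls -l $KEYTAB                 # readable by the submitting user?
klist -kt $KEYTAB             # contains the expected principal and KVNO?
kinit -kt $KEYTAB $PRINCIPAL  # authenticates on its own, no cached ticket?
EOF
```

If `kinit -kt` succeeds, the keytab is neither unreadable nor corrupt and the
problem lies elsewhere.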

I’ve added the full trace below. 

Gerry

Full trace:

Multiple versions of Spark are installed but SPARK_MAJOR_VERSION is not set
Spark1 will be picked by default
16/12/05 18:23:27 WARN util.NativeCodeLoader: Unable to load native-hadoop 
library for your platform... using builtin-java classes where applicable
16/12/05 18:23:27 INFO security.UserGroupInformation: Login successful for user 
me@login_node using keytab file /path/to/keytab
16/12/05 18:23:27 INFO yarn.Client: Attempting to login to the Kerberos using 
principal: me@login_node and keytab: /path/to/keytab
16/12/05 18:23:28 INFO impl.TimelineClientImpl: Timeline service address: 
http://login_node1.xcat.cluster:8188/ws/v1/timeline/
16/12/05 18:23:28 INFO client.RMProxy: Connecting to ResourceManager at 
login_node1.xcat.cluster/
16/12/05 18:23:28 INFO client.AHSProxy: Connecting to Application History 
server at login_node1.xcat.cluster/
16/12/05 18:23:28 WARN shortcircuit.DomainSocketFactory: The short-circuit 
local reads feature cannot be used because libhadoop cannot be loaded.
16/12/05 18:23:28 INFO yarn.Client: Requesting a new application from cluster 
with 32 NodeManagers
16/12/05 18:23:28 INFO yarn.Client: Verifying our application has not requested 
more than the maximum memory capability of the cluster (15360 MB per container)
16/12/05 18:23:28 INFO yarn.Client: Will allocate AM container, with 1408 MB 
memory including 384 MB overhead
16/12/05 18:23:28 INFO yarn.Client: Setting up container launch context for our 
AM
16/12/05 18:23:28 INFO yarn.Client: Setting up the launch environment for our 
AM container
16/12/05 18:23:28 INFO yarn.Client: Using the spark assembly jar on HDFS 
because you are using HDP, 
defaultSparkAssembly:hdfs://login_node1.xcat.cluster:8020/hdp/apps/2.5.0.0-1245/spark/spark-hdp-assembly.jar
16/12/05 18:23:28 INFO yarn.Client: Credentials file set to:
16/12/05 18:23:28 INFO yarn.YarnSparkHadoopUtil: getting token for namenode: 
hdfs://login_node1.xcat.cluster:8020/user/me/.sparkStaging/application_
16/12/05 18:23:28 INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token 1856 
for me on
16/12/05 18:23:28 INFO yarn.Client: Renewal Interval set to 8649
16/12/05 18:23:28 INFO yarn.Client: Preparing resources for our AM container
16/12/05 18:23:28 INFO yarn.YarnSparkHadoopUtil: getting token for namenode: 
hdfs://login_node1.xcat.cluster:8020/user/me/.sparkStaging/application_
16/12/05 18:23:28 INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token 1857 
for me on 
16/12/05 18:23:29 INFO yarn.YarnSparkHadoopUtil: HBase class not found 
java.lang.ClassNotFoundException: org.apache.hadoop.hbase.HBaseConfiguration
16/12/05 18:23:29 INFO yarn.Client: To enable the AM to login from keytab, 
credentials are being copied over to the AM via the YARN Secure Distributed 
Cache.
16/12/05 18:23:29 INFO yarn.Client: Uploading resource file:/path/to/keytab -> 
hdfs://login_node1.xcat.cluster:8020/user/me/.sparkStaging/application_1479877553404_0041/keytab
16/12/05 18:23:29 INFO yarn.Client: Using the spark assembly jar on HDFS 
because you are using HDP, 
defaultSparkAssembly:hdfs://login_node1.xcat.cluster:8020/hdp/apps/2.5.0.0-1245/spark/spark-hdp-assembly.jar
16/12/05 18:23:29 INFO yarn.Client: Source and destination file systems are the 
same. Not copying 
hdfs://login_node1.xcat.cluster:8020/hdp/apps/2.5.0.0-1245/spark/spark-hdp-assembly.jar
16/12/05 18:23:29 INFO yarn.Client: Uploading resource 
file:/home/me/Aoife/spark-abm/target/scala-2.10/graphx_sp_2.10-1.0.jar -> 
hdfs://login_node1.xcat.cluster:8020/user/me/.sparkStaging/application_1479877553404_0041/graphx_sp_2.10-1.0.jar
16/12/05 18:23:29 INFO yarn.Client: Uploading resource 
file:/tmp/spark-2e566133-d50a-4904-920e-ab5cec07c644/__spark_conf__6538744395325375994.zip
 -> 
hdfs://login_node1.xcat.cluster:8020/user/me/.sparkStaging/application_1479877553404_0041/__spark_conf__6538744395325375994.zip
16/12/05 18:23:29 INFO spark.SecurityManager: Changing view acls to: me
16/12/05 18:23:29 INFO spark.SecurityManager: Changing modify acls to: me
16/12/05 18:23:29 INFO spark.SecurityManager: SecurityManager: authentication 
disabled; ui acls disabled; users with view permissions: Set(me); users with 
modify permissions: Set(me)
16/12/05 18:23:29 INFO yarn.Client: Submitting application 41 to ResourceManager
16/12/05 18:23:30 INFO impl.YarnClientImpl: Submitted application 
application_1479877553404_0041
16/12/05 18:23:31 INFO yarn.Client: Application report for 
application_1479877553404_0041 (state: ACCEPTED)
16/12/05 18:23:31 INFO yarn.Client:
 client token: Token { kind: 

Re: Kerberos and YARN - functions in spark-shell and spark submit local but not cluster mode

2016-12-05 Thread Marcelo Vanzin
There's generally an exception in these cases, and you haven't posted
it, so it's hard to tell you what's wrong. The most probable cause,
without the extra information the exception provides, is that you're
using the wrong Hadoop configuration when submitting the job to YARN.
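One quick check along those lines — a sketch only; the conf path is the
common HDP default, not something confirmed in this thread:

```shell
# Sketch: verify which Hadoop configuration spark-submit will pick up.
# /etc/hadoop/conf is a typical HDP default, not confirmed in this thread.
CONF_DIR=${HADOOP_CONF_DIR:-/etc/hadoop/conf}
echo "Using Hadoop conf from: $CONF_DIR"
# fs.defaultFS should normally be an hdfs:// URI for Spark on YARN:
grep -A1 'fs.defaultFS' "$CONF_DIR/core-site.xml" 2>/dev/null \
  || echo "no core-site.xml under $CONF_DIR"
```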

On Mon, Dec 5, 2016 at 4:35 AM, Gerard Casey  wrote:
> Hello all,
>
> I am using Spark with Kerberos authentication.
>
> I can run my code using `spark-shell` fine and I can also use `spark-submit`
> in local mode (e.g. `--master local[16]`). Both function as expected.
>
> local mode -
>
> spark-submit --class "graphx_sp" --master local[16] --driver-memory 20G
> target/scala-2.10/graphx_sp_2.10-1.0.jar
>
> I am now progressing to run in cluster mode using YARN.
>
> cluster mode with YARN -
>
> spark-submit --class "graphx_sp" --master yarn --deploy-mode cluster
> --executor-memory 13G --total-executor-cores 32
> target/scala-2.10/graphx_sp_2.10-1.0.jar
>
> However, this returns:
>
> diagnostics: User class threw exception:
> org.apache.hadoop.security.AccessControlException: Authentication required
>
> Before I run using spark-shell or on local mode in spark-submit I do the
> following kerberos setup:
>
> kinit -k -t ~/keytab -r 7d `whoami`
>
> Clearly, this setup is not extending to the YARN setup. How do I fix the
> Kerberos issue with YARN in cluster mode? Is this something which must be in
> my /src/main/scala/graphx_sp.scala file?
>
> Many thanks
>
> Geroid



-- 
Marcelo




Re: Kerberos and YARN - functions in spark-shell and spark submit local but not cluster mode

2016-12-05 Thread Jorge Sánchez
Hi Gerard,

have you tried running in yarn-client mode? If so, do you still get that
same error?
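For reference, a client-mode submit would look roughly like this (class name
and jar path taken from your earlier commands; this is a diagnostic sketch,
not a fix):

```shell
# Diagnostic sketch: client mode keeps the driver on the submitting host,
# where the local `kinit` ticket cache is visible, unlike cluster mode.
CMD="spark-submit --class graphx_sp --master yarn --deploy-mode client \
target/scala-2.10/graphx_sp_2.10-1.0.jar"
echo "$CMD"
```

If client mode works but cluster mode does not, that points at credentials not
reaching the cluster-side driver rather than at the keytab itself.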

Regards.

2016-12-05 12:49 GMT+00:00 Gerard Casey :

> Edit. From here I read that you can pass a `keytab` option to spark-submit. I
> thus tried
>
> spark-submit --class "graphx_sp" --master yarn --keytab /path/to/keytab
> --deploy-mode cluster --executor-memory 13G --total-executor-cores 32
> target/scala-2.10/graphx_sp_2.10-1.0.jar
>
> However, the error persists
>
> Any ideas?
>
> Thanks
>
> Geroid
>
> On 5 Dec 2016, at 13:35, Gerard Casey  wrote:
>
> Hello all,
>
> I am using Spark with Kerberos authentication.
>
> I can run my code using `spark-shell` fine and I can also use
> `spark-submit` in local mode (e.g. `--master local[16]`). Both function as
> expected.
>
> local mode -
>
> spark-submit --class "graphx_sp" --master local[16] --driver-memory 20G
> target/scala-2.10/graphx_sp_2.10-1.0.jar
>
> I am now progressing to run in cluster mode using YARN.
>
> cluster mode with YARN -
>
> spark-submit --class "graphx_sp" --master yarn --deploy-mode cluster
> --executor-memory 13G --total-executor-cores 32
> target/scala-2.10/graphx_sp_2.10-1.0.jar
>
> However, this returns:
>
> diagnostics: User class threw exception:
> org.apache.hadoop.security.AccessControlException: Authentication required
>
> Before I run using spark-shell or on local mode in spark-submit I do the
> following kerberos setup:
>
> kinit -k -t ~/keytab -r 7d `whoami`
>
> Clearly, this setup is not extending to the YARN setup. How do I fix the
> Kerberos issue with YARN in cluster mode? Is this something which must be
> in my /src/main/scala/graphx_sp.scala file?
>
> Many thanks
>
> Geroid
>
>
>


Re: Kerberos and YARN - functions in spark-shell and spark submit local but not cluster mode

2016-12-05 Thread Gerard Casey
Edit. From here I read that you can pass a `keytab` option to spark-submit. I 
thus tried

spark-submit --class "graphx_sp" --master yarn --keytab /path/to/keytab 
--deploy-mode cluster --executor-memory 13G --total-executor-cores 32 
target/scala-2.10/graphx_sp_2.10-1.0.jar

However, the error persists

Any ideas?

Thanks

Geroid 
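
One thing worth checking: spark-submit expects `--keytab` to be paired with
`--principal` so the application master can log in from the keytab itself. A
sketch, with a placeholder principal (not a confirmed fix for this thread):

```shell
# Sketch: the principal is a placeholder; keytab path and resources as above.
CMD="spark-submit --class graphx_sp --master yarn --deploy-mode cluster \
--principal me@EXAMPLE.REALM --keytab /path/to/keytab \
--executor-memory 13G target/scala-2.10/graphx_sp_2.10-1.0.jar"
echo "$CMD"
```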

> On 5 Dec 2016, at 13:35, Gerard Casey  wrote:
> 
> Hello all,
> 
> I am using Spark with Kerberos authentication.
> 
> I can run my code using `spark-shell` fine and I can also use `spark-submit` 
> in local mode (e.g. `--master local[16]`). Both function as expected.
> 
> local mode -
> 
>   spark-submit --class "graphx_sp" --master local[16] --driver-memory 20G 
> target/scala-2.10/graphx_sp_2.10-1.0.jar
> 
> I am now progressing to run in cluster mode using YARN.
> 
> cluster mode with YARN - 
> 
>   spark-submit --class "graphx_sp" --master yarn --deploy-mode cluster 
> --executor-memory 13G --total-executor-cores 32 
> target/scala-2.10/graphx_sp_2.10-1.0.jar
> 
> However, this returns:
> 
>   diagnostics: User class threw exception: 
> org.apache.hadoop.security.AccessControlException: Authentication required
> 
> Before I run using spark-shell or on local mode in spark-submit I do the 
> following kerberos setup:
> 
>   kinit -k -t ~/keytab -r 7d `whoami`
> 
> Clearly, this setup is not extending to the YARN setup. How do I fix the 
> Kerberos issue with YARN in cluster mode? Is this something which must be in 
> my /src/main/scala/graphx_sp.scala file? 
> 
> Many thanks
> 
> Geroid 



Kerberos and YARN - functions in spark-shell and spark submit local but not cluster mode

2016-12-05 Thread Gerard Casey
Hello all,

I am using Spark with Kerberos authentication.

I can run my code using `spark-shell` fine and I can also use `spark-submit` in 
local mode (e.g. `--master local[16]`). Both function as expected.

local mode -

spark-submit --class "graphx_sp" --master local[16] --driver-memory 20G 
target/scala-2.10/graphx_sp_2.10-1.0.jar

I am now progressing to run in cluster mode using YARN.

cluster mode with YARN - 

spark-submit --class "graphx_sp" --master yarn --deploy-mode cluster 
--executor-memory 13G --total-executor-cores 32 
target/scala-2.10/graphx_sp_2.10-1.0.jar

However, this returns:

diagnostics: User class threw exception: 
org.apache.hadoop.security.AccessControlException: Authentication required

Before I run using spark-shell or on local mode in spark-submit I do the 
following kerberos setup:

kinit -k -t ~/keytab -r 7d `whoami`

Clearly, this setup is not extending to the YARN setup. How do I fix the 
Kerberos issue with YARN in cluster mode? Is this something which must be in my 
/src/main/scala/graphx_sp.scala file? 
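
A plausible explanation, sketched below with a placeholder principal (this is
not a confirmed fix): the ticket that `kinit` obtains lives in a per-host
credential cache, which the driver never sees once it runs inside the YARN
application master on a cluster node. The usual remedy is extra spark-submit
flags rather than anything in graphx_sp.scala.

```shell
# Sketch: the `kinit` ticket lands in a per-host credential cache file, which
# the YARN AM and executors on other nodes never see; that is why local and
# spark-shell runs work while cluster mode fails.
CACHE="${KRB5CCNAME:-FILE:/tmp/krb5cc_$(id -u)}"
echo "ticket cache visible on this host only: $CACHE"
# In cluster mode, pass the credentials explicitly (placeholder principal;
# ~/keytab matches the path used with kinit above):
FIX="--principal me@EXAMPLE.REALM --keytab $HOME/keytab"
echo "add to spark-submit: $FIX"
```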

Many thanks

Geroid