My understanding is that token generation is handled by Spark itself, as
long as you were authenticated with Kerberos when submitting the job and
spark.authenticate is set to true.
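
For example, a minimal sketch (the principal, class and jar names below are
just placeholders, not anything from your setup):

    # authenticate against the KDC first
    kinit user@EXAMPLE.COM
    klist    # confirm a valid TGT is present

    # then submit normally; Spark obtains the HDFS delegation tokens itself
    spark-submit \
      --master yarn \
      --conf spark.authenticate=true \
      --class com.example.MyApp \
      myapp.jar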

The --keytab and --principal options should be used for long-running jobs,
where you may need ticket renewal; Spark will then handle it for you. I may
be wrong though.
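
Something like this, if I remember the options correctly (the principal and
keytab path are again placeholders):

    # long-running job: Spark can re-login from the keytab and renew tokens
    spark-submit \
      --master yarn \
      --principal user@EXAMPLE.COM \
      --keytab /path/to/user.keytab \
      --class com.example.MyStreamingApp \
      myapp.jar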

I guess it gets even more complicated if you need to access other secured
services from Spark, like HBase or Phoenix, but I guess that is for another
discussion.

Regards,
Marcin

On Thu, Dec 8, 2016, 08:40 Gerard Casey <gerardhughca...@gmail.com> wrote:

> I just read an interesting comment on cloudera:
>
> What does it mean by “when the job is submitted, and you have a kinit, you
> will have TOKEN to access HDFS, you would need to pass that on, or the
> KERBEROS ticket”?
>
> Reference
> <https://community.cloudera.com/t5/Advanced-Analytics-Apache-Spark/org-apache-hadoop-security-AccessControlException-SIMPLE/td-p/28082>
>  and
> full quote:
>
> In a cluster which is kerberised there is no SIMPLE authentication. Make
> sure that you have run kinit before you run the application.
> Second thing to check: In your application you need to do the right thing
> and either pass on the TOKEN or a KERBEROS ticket.
> When the job is submitted, and you have done a kinit, you will have TOKEN
> to access HDFS you would need to pass that on, or the KERBEROS ticket.
> You will need to handle this in your code. I can not see exactly what you
> are doing at that point in the startup of your code but any HDFS access
> will require a TOKEN or KERBEROS ticket.
>
>
> Cheers,
> Wilfred
>
> On 8 Dec 2016, at 08:35, Gerard Casey <gerardhughca...@gmail.com> wrote:
>
> Thanks Marcelo.
>
> I’ve completely removed it. Ok - even if I read/write from HDFS?
>
> Trying to run the SparkPi example now
>
> G
>
> On 7 Dec 2016, at 22:10, Marcelo Vanzin <van...@cloudera.com> wrote:
>
> Have you removed all the code dealing with Kerberos that you posted?
> You should not be setting those principal / keytab configs.
>
> Literally all you have to do is login with kinit then run spark-submit.
>
> Try with the SparkPi example for instance, instead of your own code.
> If that doesn't work, you have a configuration issue somewhere.
>
> On Wed, Dec 7, 2016 at 1:09 PM, Gerard Casey <gerardhughca...@gmail.com>
> wrote:
>
> Thanks.
>
> I’ve checked the TGT, principal and keytab. Where to next?!
>
> On 7 Dec 2016, at 22:03, Marcelo Vanzin <van...@cloudera.com> wrote:
>
> On Wed, Dec 7, 2016 at 12:15 PM, Gerard Casey <gerardhughca...@gmail.com>
> wrote:
>
> Can anyone point me to a tutorial or a run-through of how to use Spark with
> Kerberos? This is proving to be quite confusing. Most search results on the
> topic point to what needs to be input at the point of `spark-submit` and not
> the changes needed in the actual src/main/.scala file
>
>
> You don't need to write any special code to run Spark with Kerberos.
> Just write your application normally, and make sure you're logged in
> to the KDC (i.e. "klist" shows a valid TGT) before running your app.
>
>
> --
> Marcelo
>
>
>
>
>
> --
> Marcelo
>
>
>
>
